
Institutionen för systemteknik
Department of Electrical Engineering

Degree thesis (Examensarbete)

Reducing Energy Consumption Through Image Compression

Thesis carried out in Computer Engineering (Datateknik) at Linköping Institute of Technology, Linköping University, by

Mats Ferdeen
LiTH-ISY-EX--16/4995--SE

Linköping 2016

Department of Electrical Engineering, Linköpings universitet


Reducing Energy Consumption Through Image Compression

Thesis carried out in Computer Engineering (Datateknik) at Linköping Institute of Technology, Linköping University

by

Mats Ferdeen
LiTH-ISY-EX--16/4995--SE

Supervisor: Andreas Ehliar, ISY, Linköpings universitet

Examiner: Oscar Gustafsson, ISY, Linköpings universitet


Avdelning, Institution (Division, Department): Avdelningen för Datorteknik (Division of Computer Engineering), Department of Electrical Engineering, SE-581 83 Linköping

Datum (Date): 2016-07-15

Språk (Language): English

Rapporttyp (Report category): Examensarbete (degree thesis)

URL för elektronisk version (URL for electronic version): http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-134335

ISBN: —
ISRN: LiTH-ISY-EX--16/4995--SE
ISSN: —

Titel (Title): Reducera energiförbrukning genom bildkompression / Reducing Energy Consumption Through Image Compression

Författare (Author): Mats Ferdeen

Nyckelord (Keywords): Energy Consumption, Off-chip Memory, Image Compression, Structure From Motion, feature detection, camera trajectory, area error, block-wise image compression, JPEG, WebP


Abstract

The energy consumption of off-chip memory writes and reads is a known problem. In the image processing field structure from motion, simpler compression techniques could be used to save energy. The balance between detected features, such as corners, edges, etc., and the degree of compression then becomes a big issue to investigate. In this thesis a deeper study of this balance is performed. A number of more advanced compression algorithms for still images, such as JPEG, are used for comparison with a selected number of simpler compression algorithms. The simpler algorithms can be divided into two categories: individual block-wise compression of each image, and compression with respect to all pixels in each image. In this study the image sequences are in grayscale and provided from an earlier study about rolling shutters. Synthetic data sets from a further study about optical flow are also included to see how reliable the other data sets are.


Acknowledgments

I thank my family for all the support and hope you have given me during my years of study at the university.

Linköping, April 2016
Mats Ferdeen


Contents

Notation

1 Introduction
   1.1 Scenario
      1.1.1 Compression algorithms for benchmarking (Approach A)
      1.1.2 Proposed algorithms (Approach B)
   1.2 Motivation and purpose
   1.3 Question and problem formulations
   1.4 Limitations

2 Earlier work and related research
   2.1 Block truncation coding (BTC)
   2.2 Monochrome palette
   2.3 Bilinear interpolation
   2.4 Texture Compression (TC)
   2.5 Structure from motion (SFM)
      2.5.1 Features and detectors
      2.5.2 Feature trajectories
      2.5.3 Camera trajectories
      2.5.4 Area error
   2.6 Ground truth data from synthetic video frames
   2.7 Gaussian noise

3 Methodology
   3.1 Implementation
      3.1.1 Limitations
      3.1.2 Block-wise image compression
      3.1.3 Frame-wise image compression
   3.2 Evaluation
      3.2.1 Extraction of frames from video with ffmpeg
      3.2.2 VisualSFM
      3.2.3 The data sets

4 Results
   4.1 Implementation
   4.2 Evaluation
      4.2.1 MOV01
      4.2.2 MOV08 and MOV09
      4.2.3 MOV_mountain_1

5 Discussion
   5.1 Results
   5.2 Method
   5.3 Future work

6 Conclusions

A The Simulation results


Notation

Abbreviations

Abbreviation Meaning

BTC Block Truncation Coding

TC Texture Compression

LSB Least Significant Bit

SFM Structure From Motion

PCB Printed Circuit Board

SRAM Static Random-Access Memory

DRAM Dynamic Random-Access Memory

CPU Central Processing Unit

FPGA Field-Programmable Gate Array

LUT Lookup Table

FF Flip-flop

IO Input & Output

BUFG Global Buffer

DSP48 Digital Signal Processing unit


1 Introduction

The power consumption of off-chip memory reading is a big problem when huge amounts of data must be processed over longer distances on a PCB. This problem is addressed in [17], from which the data in tables 1.1 and 1.2 have been gathered. Table 1.1 shows the energy consumption of a couple of different processor chip components in 40 nm and 10 nm transistor technology. For example, reading 64 bits from an on-chip 8-kbyte static RAM (often used as cache memory) consumes about 14 pJ in 40 nm technology. If the SRAM were outside the chip, the energy consumed for interfacing, accessing and moving data through the PCB wires would have to be counted instead. The PCB wire transitions alone consume 153 pJ, which is at least 10 times more than reading from the on-chip SRAM. Since the SRAM is used as a cache for the CPU, the problem gets even worse when data must be requested from the DRAM; the energy consumption for this can be seen in table 1.2. Accessing 64 bits from a DRAM costs at least 36 times more, without even including the energy cost of interfacing with the DRAM. By 2017 the energy consumption of accessing or interfacing with a DRAM has dropped by at least a factor of four, but the energy consumption of wire transitions is only about halved.

Integrating the graphics processor with the CPU, as in Intel HD and Iris graphics, is a modern solution to this; general information about it can be found in [8]. Another approach is to use hardware-controlled compression for data transferred between chips. Depending on the scenario, different compression techniques must be considered. An interesting scenario is image compression in structure from motion (SFM). The ability to record movements in a video and recreate objects from it in 3d is important when recording landscapes from high in the sky as well as from the ground. For this to be possible, good feature detection is crucial to reconstruct a 3d model out of a set of 2d images, and too lossy image compression may degrade it significantly. The main topic of this thesis is the research of this trade-off.


                                   2010             2017 (high frequency)   2017 (low voltage)
Process technology                 40 nm            10 nm                   10 nm
Vdd (nominal)                      0.90 V           0.75 V                  0.65 V
Frequency                          1.60 GHz         2.50 GHz                2.00 GHz
64-bit read from an 8-kbyte
on-chip static RAM (SRAM)          14.0 pJ          2.40 pJ                 1.80 pJ
Wire energy (per transition)       240 fJ/bit/mm    150 fJ/bit/mm           115 fJ/bit/mm
Wire energy (64 bits, 10 mm)       153 pJ           96 pJ                   73 pJ

Table 1.1: Energy consumption of internal SRAM and PCB wires.

                                   2010             2017
DRAM process technology            45 nm            16 nm
DRAM interface pin bandwidth       4 Gb/s           50 Gb/s
DRAM interface energy              20-30 pJ/bit     2 pJ/bit
DRAM access energy                 8-15 pJ/bit      2.5 pJ/bit

Table 1.2: Energy consumption of external DRAM.

1.1 Scenario

The scenario for the thesis is illustrated in figure 1.1. A camera records a video of a given scene. From the video a set of still images is obtained. Each still image is divided into 4 × 4 blocks of pixels. All blocks of a still image are sent to a coder, which compresses the data in each block. The compressed data are thereafter sent to a memory for storage. When the compressed data are needed by a process, they are fetched from the memory; a decoder recreates the data and the content is processed by the CPU. The coder and decoder are implemented in both soft- and hardware.

The workflow of the thesis is described in figure 1.2. The incoming data is still images in grayscale format from a video. The next step requires two different approaches:

• The grayscale still images are compressed and decompressed in chosen com-pression algorithms for benchmarking in the scenario.

Figure 1.1: The scenario for image compression in structure from motion.

• The proposed algorithms perform the compression (coding) and decompression (decoding) of the grayscale still images.

The code is implemented in software. If a proposed algorithm gives good results in the scenario, it is also implemented in hardware. Since hardware simulations are time consuming, it is critical to implement and verify the proposed algorithms in software first.

The still images are taken from a video in grayscale format with a sampling interval suitable for the scenario. The frames captured from video are stored in an uncompressed or lossless format such as BMP or PNG. The videos were provided by Per-Erik Forssén at ISY Computer Vision for an earlier study [16] about rolling shutter and were recorded with a Canon s95 camera. A couple of synthetic data sets are also used; they are gathered from another survey [10].

1.1.1 Compression algorithms for benchmarking (Approach A)

JPEG and JPEG2000 are two commonly used compression algorithms for still images. These are used in software for benchmarking of the hardware. A comparison between JPEG and JPEG2000 was carried out in [13]. Google's WebP algorithm is compared with as well; it has both a lossy and a lossless method, and both are used for comparison. In the comparative study [6], the results show that WebP outperforms both JPEG and JPEG2000: while the image quality measured in PSNR turned out to be similar for all three, the non-negative average compression was 41.30 % for WebP, 22.37 % for JPEG and 27.67 % for JPEG2000. For rendering purposes, the texture compression algorithm Crunch dxt is used for benchmarking.

Figure 1.2: The workflow of the thesis work.

1.1.2 Proposed algorithms (Approach B)

The main idea here is to split up the still image into blocks of pixels and execute compression individually for each block. For each block a different algorithm may be executed depending on the pixel values in it. The proposed algorithms for software are:

• Block truncation coding (BTC)

• Erase least significant bits (Erase LSB:s)

• Bilinear interpolation from corners

• Texture Compression, without alpha (TC)

• BTC and bilinear interpolation from corners

• Erase 4 LSB:s and bilinear interpolation from corners

• Resize with gamma correction and bilinear interpolation

• Monochrome palette with Floyd-Steinberg dithering

1.2 Motivation and purpose

The motivation and purpose of this thesis is to investigate the trade-off between the compression rate of a set of 2d images and the features detected from them in structure from motion, and how the complexity of the corresponding hardware implementations compares.


1.3 Question and problem formulations

The main questions are stated as follows:

• What is the correlation between compression ratio and detected features in a set of images?

• What are the most critical cases in structure from motion to be considered for the scenario?

• Can the compression ratio and number of detected features improve by combining two or more different compression algorithms?

The problem formulations are based upon the trade-off between detected features and compression ratio, and upon the hardware implementations:

• Keeping the compression ratio and number of detected features as high as possible.

• The complexity of the hardware vs the software implementation.

• The speed of the compression and decompression in hardware.

1.4 Limitations

The following limitations are considered:

• The amount of scenes for each case to simulate in SFM.

• The amount of algorithms for compression in approach B.

• The amount of hardware simulations; each simulation is time consuming for system verification compared to simulation in software.


2 Earlier work and related research

2.1 Block truncation coding (BTC)

This algorithm is one of the simplest in use for image compression. The compression ratio of BTC is considerably worse than that of JPEG. What makes BTC interesting is its compression speed, as mentioned in [14]; this advantage is useful when streaming content through the hardware. A general description of how BTC works can be found in [12]. An example of a block consisting of 4 × 4 pixels is shown in (2.1).

\begin{pmatrix} 128 & 128 & 128 & 127 \\ 129 & 130 & 128 & 129 \\ 129 & 130 & 128 & 127 \\ 128 & 129 & 128 & 129 \end{pmatrix} \quad (2.1)

To encode the block, the rule in (2.2) is used: each pixel is encoded as 1 or 0 depending on the mean value \bar{x} of the block it belongs to. The encoded result of the given block can be seen in (2.3). The algorithm calculates the coefficients needed for decoding with the equations in (2.4), where σ is the standard deviation of the block, q is the number of pixels greater than the mean value and m is the total number of pixels in the block. To determine the value of each pixel when decoding, the rule in (2.5) is used. The decoded block can be seen in (2.6). This algorithm gives a fixed 16:3 compression.

y(i, j) = \begin{cases} 1, & x(i, j) > \bar{x} \\ 0, & x(i, j) \le \bar{x} \end{cases} \quad (2.2)

\begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix} \quad (2.3)

a = \bar{x} - \sigma\sqrt{\frac{q}{m-q}}, \qquad b = \bar{x} + \sigma\sqrt{\frac{m-q}{q}} \quad (2.4)

x(i, j) = \begin{cases} a, & y(i, j) = 0 \\ b, & y(i, j) = 1 \end{cases} \quad (2.5)

\begin{pmatrix} 127 & 127 & 127 & 127 \\ 128 & 128 & 127 & 128 \\ 128 & 128 & 127 & 127 \\ 127 & 128 & 127 & 128 \end{pmatrix} \quad (2.6)
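A minimal software sketch of the encode and decode steps above (Python; the function names are illustrative and not taken from the thesis code):

    import math

    def btc_encode(block):
        # Encode a 4x4 grayscale block per (2.2) and (2.4); assumes 0 < q < m.
        pixels = [p for row in block for p in row]
        m = len(pixels)                             # total number of pixels
        mean = sum(pixels) / m
        sigma = math.sqrt(sum((p - mean) ** 2 for p in pixels) / m)
        bitmap = [[1 if p > mean else 0 for p in row] for row in block]
        q = sum(y for row in bitmap for y in row)   # pixels above the mean
        a = mean - sigma * math.sqrt(q / (m - q))   # level for the 0-pixels
        b = mean + sigma * math.sqrt((m - q) / q)   # level for the 1-pixels
        return bitmap, round(a), round(b)

    def btc_decode(bitmap, a, b):
        # (2.5): 0 -> a, 1 -> b
        return [[b if y else a for y in row] for row in bitmap]

    block = [[128, 128, 128, 127],
             [129, 130, 128, 129],
             [129, 130, 128, 127],
             [128, 129, 128, 129]]
    bitmap, a, b = btc_encode(block)   # bitmap matches (2.3)
    # Note: the thesis hardware uses a simplified sigma estimate and lookup
    # tables, so the decoded levels in (2.6) differ slightly from this
    # floating-point version.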

2.2 Monochrome palette

By using a simple approach such as a monochrome palette, the number of bits to represent a pixel can be greatly reduced. Older systems like the Nintendo Gameboy used 4 shades of gray [4], a two-bit monochrome palette. By using a palette of 3 or 4 bits for each pixel per block, the compression ratio becomes 16:3 or 16:4, i.e. equal to or slightly worse than that of BTC.
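A sketch of the palette quantization (Python; the helper is illustrative, not from the thesis code):

    def quantize(pixel, bits):
        # Map an 8-bit gray value to an n-bit palette index and back.
        levels = (1 << bits) - 1
        index = round(pixel * levels / 255)   # stored n-bit index
        return round(index * 255 / levels)    # reconstructed gray value

    # A two-bit palette, as on the Gameboy, keeps four shades of gray:
    print([quantize(p, 2) for p in (0, 70, 128, 255)])   # [0, 85, 170, 255]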

2.3 Bilinear interpolation

For general information about bilinear interpolation, see [1]. An example of bilinear interpolation is shown in figure 2.1. The method first interpolates horizontally between the two pairs of corner pixels; these two interpolation points are then used to interpolate vertically at the given pixel, hence the name of the method. The interpolation problem can be stated as in (2.7), and the system for the weights can be written as in (2.8). As mentioned in [20], bilinear interpolation is faster than bicubic interpolation, which makes it more suitable for streaming of data.

f(x, y) \approx b_{11} f(Q_{11}) + b_{12} f(Q_{12}) + b_{21} f(Q_{21}) + b_{22} f(Q_{22}) \quad (2.7)

\begin{pmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{pmatrix} = \left( \begin{pmatrix} 1 & x_1 & y_1 & x_1 y_1 \\ 1 & x_1 & y_2 & x_1 y_2 \\ 1 & x_2 & y_1 & x_2 y_1 \\ 1 & x_2 & y_2 & x_2 y_2 \end{pmatrix}^{-1} \right)^{T} \begin{pmatrix} 1 \\ x \\ y \\ xy \end{pmatrix} \quad (2.8)
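The two-step form as a minimal numeric sketch (Python; the block coordinates and corner values are illustrative):

    def bilerp(f11, f21, f12, f22, x, y, x1=1, y1=1, x2=4, y2=4):
        # Corner values f(Q11), f(Q21), f(Q12), f(Q22) at (x1,y1)..(x2,y2).
        tx = (x - x1) / (x2 - x1)          # horizontal weight
        ty = (y - y1) / (y2 - y1)          # vertical weight
        top = f11 + tx * (f21 - f11)       # interpolate along the row y = y1
        bottom = f12 + tx * (f22 - f12)    # interpolate along the row y = y2
        return top + ty * (bottom - top)   # interpolate between the two rows

    # The pixel at (2,2) in figure 2.1, with illustrative corner values:
    print(bilerp(100, 130, 110, 150, 2, 2))   # ~114.4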

Figure 2.1: An example of bilinear interpolation of a 4 × 4 block of pixels.

2.4 Texture Compression (TC)

General information about Texture Compression can be found in [3]. The main idea of TC is to create a dynamic monochrome palette for each block of pixels. To determine the palette for a block, the two extreme pixel values (maximum and minimum) in it are used; by interpolating two values between the extremes, a monochrome palette is created. This is the main advantage of TC. However, as mentioned in [21], the algorithm to find a line through color space is expensive, so TC is not an ideal algorithm for streaming of image data. Although it may lack speed, it is still of interest in this thesis, since preserving texture is important for the detection of features in an image.
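A rough software sketch of the grayscale case (Python; integer interpolation is used here, whereas the hardware of chapter 3 multiplies by a fractional constant — the helper names are illustrative):

    def tc_encode_gray(block):
        # Build a 4-entry palette from the extremes; code each pixel in 2 bits.
        pixels = [p for row in block for p in row]
        lo, hi = min(pixels), max(pixels)
        palette = [lo, (2 * lo + hi) // 3, (lo + 2 * hi) // 3, hi]
        codes = [min(range(4), key=lambda i: abs(palette[i] - p)) for p in pixels]
        return lo, hi, codes   # 2 bytes of extremes + 2 bits per pixel

    def tc_decode_gray(lo, hi, codes):
        palette = [lo, (2 * lo + hi) // 3, (lo + 2 * hi) // 3, hi]
        return [palette[c] for c in codes]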

2.5 Structure from motion (SFM)

Structure from motion is a ranging technique with the ability to create 3d structure from a set of 2d images in the motion field of a moving object or scene. A couple of terms regarding SFM are explained further here.

2.5.1 Features and detectors

In image processing, features are certain patterns in an image such as corners, edges or blobs. There are numerous feature detectors to find these; general information about feature detection can be found in [9]. A couple of examples are:

• The edge detector Canny edge detector [11].

• The edge and corner detector Harris & Stephens [15].

• The corner and blob detector FAST [19].

• The scale-invariant feature transform (SIFT) [18], based upon the corner and blob detector Difference of Gaussians (DoG). A very popular feature detector for SFM, and the one used in this thesis.

2.5.2 Feature trajectories

After the feature detection of a pair of images, the trajectories between the features they share are calculated. These trajectories are then used to calculate the motion of the camera and its relative position.

2.5.3 Camera trajectories

The camera trajectories are the final output data from SFM on a set of 2d images. Comparing the camera trajectories of a set of compressed 2d images with those of the raw (uncompressed) set is the method used in this thesis to find the correlation between detected features and the compression ratio of the images.

2.5.4 Area error

By comparing the camera trajectories between a set of compressed and uncompressed 2d images, the area error is calculated. A simple example is shown in figure 2.2: the triangular areas between the camera trajectories are calculated and summed, and the sum is called the area error. The method of using the area error for comparison is also used and further explained in [16].
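A sketch of the computation (Python; 2d camera positions and the triangle split of figure 2.2 are assumed — the exact construction used in [16] may differ):

    def tri_area(p, q, r):
        # Area of a triangle from the cross product of two edge vectors.
        return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

    def area_error(raw, test):
        # raw, test: equally long lists of 2d camera positions (R1..Rn, T1..Tn).
        # The strip between the polylines is split into triangles
        # (R_i, T_i, R_{i+1}) and (T_i, R_{i+1}, T_{i+1}) and summed.
        total = 0.0
        for i in range(len(raw) - 1):
            total += tri_area(raw[i], test[i], raw[i + 1])
            total += tri_area(test[i], raw[i + 1], test[i + 1])
        return total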

2.6 Ground truth data from synthetic video frames

Using the raw (uncompressed) data of a set of 2d images for SFM may be the best reference against which to measure the area error for a compressed set of 2d images. However, synthetic videos are another way to find out how well the raw data set can actually be used as ground truth for comparison against other sets of 2d images. Earlier work on synthetic videos can be found in [10].

Figure 2.2: The area estimation between the camera trajectories.

2.7 Gaussian noise

Gaussian noise in images is introduced by poor illumination or high temperatures in the environment where the image was taken; another factor is electronic circuit noise from transmissions inside the camera. Different levels of Gaussian noise will be applied to the set of raw images as a measure of how close their area errors are to those of the compressed sets of 2d images. The level is adjusted by changing the standard deviation of the Gaussian noise.


3 Methodology

3.1 Implementation

The feature detection needs to be taken into consideration for the image compression algorithms. The ability to stream content at a decent speed through the compression and decompression hardware is also important. This divides the algorithms into two groups. The first group is compression algorithms that can split an image into small blocks of n × n pixels and compress each of these independently of one another. The second group is algorithms that compress a whole frame (image) at a time. Both methods are spatial image compression. When integrating the hardware parts with an existing platform, external issues need to be taken into consideration. The hardware solutions are synthesized against a Field-Programmable Gate Array (FPGA).

3.1.1 Limitations

Choice of data sets for the given scenario

It is uncertain what kind of data sets would give good simulation results and correlation between the compression rate of each algorithm and the detected features.

Choice of compression algorithms for the given scenario

It is uncertain which algorithms will give good results and how well they perform on the given data sets.


Floating points in hardware implementation

Floating point operations are complex to implement in hardware. This leads to fixed point arithmetic operations instead. Quantization errors must then be taken into consideration in the calculations.

Square root in hardware implementation

Some of the more complex compression algorithms, such as Texture Compression (TC), require calculating the min and max distance in each block of pixels. In this thesis the data sets are grayscale, so a simple divide and conquer algorithm can be used to find the min and max in one dimension. With colored images, however, the Euclidean distance would have to be calculated, requiring a square root in the hardware implementation.

Compression speed for TC

TC lacks the speed to compress data.

3.1.2 Block-wise image compression

These are simple algorithms to be executed on small blocks of pixels in an image, making it possible to stream content from, for example, a camera. Since each block is processed individually, this may create noticeable artifacts and noise.

Block truncation coding (BTC)

For the software implementation, see the earlier work chapter about BTC. The key to BTC is to calculate the a and b coefficients to find the best two pixel values for the block; the mean value of the block decides which of these two coefficients a pixel is replaced with. The calculation of the mean value is shown in figure 3.1a. As can be seen, a rounding error occurs. The calculations of q and σ are dependent on the mean value. Figure 3.1b gives a simpler approach to find the deviation σ for the block: by calculating the biggest and smallest difference between the pixels of the block and the mean value, an approximate value for the deviation σ can be obtained. The coefficients a and b are calculated using a lookup table, as shown in figure 3.1c. Figure 3.1d shows how y0-y15 are coded from the pixels x0-x15 by comparison with the mean value. The coding stage for BTC is shown in figure 3.1 and the decoding stage in figure 3.2.
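As a software model of the fixed-point mean step (Python; illustrative):

    def block_mean_fixed(pixels):
        # Sum the 16 pixel values, then shift right by 4 instead of dividing;
        # the dropped fraction is the rounding error mentioned above.
        return sum(pixels) >> 4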

Erase least significant bits
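A minimal sketch of the operation implied by the name (Python; illustrative, assuming the n least significant bits of each 8-bit pixel are simply cleared before storage):

    def erase_lsbs(pixel, n):
        # Clear the n least significant bits of an 8-bit gray value.
        mask = 0xFF & ~((1 << n) - 1)
        return pixel & mask

    print(erase_lsbs(0b10110111, 4))   # -> 0b10110000 (176)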

Figure 3.1: Coding stage of BTC. (a) Calculation of the mean value. (b) Calculation of the standard deviation. (c) Calculation of the coefficients a and b. (d) Calculation of y0-y15.

Figure 3.2: Decoding stage of BTC. Consists of a simple lookup table.

Figure 3.3: Coding stage of bilinear interpolation from corners; save only the four corner pixels of the block.

Bilinear interpolation from corners

The compression algorithm with bilinear interpolation saves only the four corner pixels of the block in the compression stage. The four pixels are thereafter sent to the decompressor, which recreates the block by carrying out bilinear interpolation from the corners. Figure 3.3 shows the coding stage; it saves only the corner pixels of the block. The more complex part is the decoder in figure 3.4, which needs to execute the bilinear interpolation from the four corners; all pixels are calculated in parallel. It is important here to sign extend all bit operations due to negative numbers. The weighting of the interpolation can be done by multiplying with a set of fractional bits.

Texture Compression, without alpha (TC)

A Texture Compression without alpha (transparency) is used for the still images, as discussed in the earlier work chapter about TC. Since the input data are grayscale, TC is easy to execute. The coder is shown in figures 3.5 and 3.6. In the coding stage the task is to find the two extremes (minimum and maximum) of the block of pixels, as seen in figures 3.5a and 3.5b. Thereafter the extremes are used to interpolate a lookup table, as seen in figure 3.6a, which is used to code each pixel, as seen in figure 3.6b. The decode stage is shown in figure 3.7.

Figure 3.4: Decoding stage of bilinear interpolation from corners. Calculates the interpolated pixels in parallel.


Both the code and decode stages need to do interpolation to create the lookup table.

Block truncation coding and bilinear interpolation from corners

This compression consists of the compression algorithms BTC and bilinear interpolation. To decide which algorithm to use for a block, its standard deviation is compared with a given threshold. Since the deviation is the square root of the variance, it is easier to think of the latter when determining the corresponding value for the block. If the variance in the block is low, compression with bilinear interpolation is used; otherwise BTC is used. The variance of blocks of 4 × 4 pixels is likely to be low overall, so the threshold needs to be adjusted to a reasonable level accordingly.
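A sketch of the selection logic (Python; the function name and string labels are illustrative):

    import math

    def choose_block_coder(block, threshold):
        # Pick bilinear interpolation for flat blocks, BTC otherwise.
        pixels = [p for row in block for p in row]
        mean = sum(pixels) / len(pixels)
        sigma = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
        return "bilinear" if sigma < threshold else "btc"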

Erase LSB:s and bilinear interpolation from corners

The same as block truncation coding and bilinear interpolation from corners, except that LSB bits are erased instead of applying BTC.

3.1.3 Frame-wise image compression

These algorithms take a whole frame at a time; the ability to stream content is slower with this approach compared to the block-wise one. At this cost of speed, however, they may create fewer artifacts and less noise in the images, since they can sweep through the whole image and take all neighboring pixels into account.

Resize with gamma correction and bilinear interpolation

A simple approach is to resize the frames and use the resized versions for the scenario. But as an image is resized, its gamma may become incorrect depending on the software used. A survey about this was done in [5]; the author mentions that many free and cheap software programs for resizing images do not follow the correct formula for gamma correction. For general information about gamma correction, see [2, 7].
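A small numeric illustration of why this matters (Python; a pure power-law gamma of 2.2 is assumed for simplicity): averaging two pixels in linear light rather than directly on the gamma-encoded values gives a different result.

    def average_pair(a, b, gamma=2.2):
        # Average two gray values in linear light instead of encoded space.
        lin = ((a / 255) ** gamma + (b / 255) ** gamma) / 2
        return 255 * lin ** (1 / gamma)

    print(average_pair(0, 255))   # ~186, not the naive 127.5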

Monochrome palette with Floyd-Steinberg dithering

A more complex monochrome palette, with dithering, is surveyed as well.
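A sketch of the classic error-diffusion scheme (Python; illustrative, with the standard Floyd-Steinberg weights 7/16, 3/16, 5/16, 1/16):

    def floyd_steinberg(img, bits):
        # Dither a grayscale image (rows of values 0-255) to an n-bit palette.
        h, w = len(img), len(img[0])
        levels = (1 << bits) - 1
        out = [list(map(float, row)) for row in img]
        for y in range(h):
            for x in range(w):
                old = out[y][x]
                index = min(levels, max(0, round(old * levels / 255)))
                new = index * 255 / levels        # nearest palette entry
                out[y][x] = new
                err = old - new
                # Spread the quantization error onto not-yet-visited neighbours:
                if x + 1 < w:
                    out[y][x + 1] += err * 7 / 16
                if y + 1 < h and x > 0:
                    out[y + 1][x - 1] += err * 3 / 16
                if y + 1 < h:
                    out[y + 1][x] += err * 5 / 16
                if y + 1 < h and x + 1 < w:
                    out[y + 1][x + 1] += err * 1 / 16
        return out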

3.2 Evaluation

The structure from motion scenario uses the program VisualSFM [22, 23, 24, 25], which calculates camera trajectories for each image sequence. The camera trajectories calculated for all compressed image sequences are compared with those of the raw image sequence to compute the area error. The trajectories are also used for comparison with Gaussian noise applied to the raw image sequence, to understand how much noise the compression may correspond to.

Figure 3.5: Coding stage of TC without alpha. First calculate (a) the maximum and (b) the minimum pixel value of the block.

Figure 3.6: Coding stage of TC without alpha. (a) Interpolation of the two values between the extremes of the block. (b) TC coding of the pixels.

Figure 3.7: Decoding stage of TC without alpha. Consists of a simple lookup table.

An area error ae within the limit 0 < ae < 0.05 m² gives reasonable results in VisualSFM.

A standard deviation σ of the Gaussian noise within the limit 3 < σ < 18 corresponds to photographing in darker environments. It can be interpreted in grayscale levels as σ = 255√v, where v is the variance. A group of varying levels of Gaussian noise applied to an image can be seen in figure 3.8. The interval of variance applied to the data sets is [0.0005, 0.0044].
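In code (Python; assuming the variance refers to image intensities normalized to [0, 1], which is what makes the grayscale conversion above come out right):

    import math

    def noise_sigma(variance):
        # Gray-level standard deviation of Gaussian noise with normalized variance.
        return 255 * math.sqrt(variance)

    # The variance interval applied to the data sets:
    print(noise_sigma(0.0005), noise_sigma(0.0044))   # ~5.7 and ~16.9 gray levels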

3.2.1 Extraction of frames from video with ffmpeg

The data sets used are a large set of different videos provided by Per-Erik Forssén at ISY. The videos are recorded with a Canon s95 camera at a capture rate of 24 frames per second, and the image sequences are extracted from these videos. An image sequence of at least 60 frames is necessary for VisualSFM to generate 3d models. To achieve this, the data sets are sampled into image sequences with different sampling rates and time intervals in ffmpeg.
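A hypothetical extraction call of the kind described (Python; the file name, sampling rate and output pattern are illustrative — the thesis does not list its exact commands):

    import subprocess

    # Equivalent to: ffmpeg -i MOV01.MOV -vf "fps=6,format=gray" frames/%04d.png
    # "fps" resamples the 24 fps video; "format=gray" forces grayscale output.
    subprocess.run(["ffmpeg", "-i", "MOV01.MOV",
                    "-vf", "fps=6,format=gray",
                    "frames/%04d.png"], check=True)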

3.2.2 VisualSFM

Pairwise matching of all images against each other is used to create the SIFTs. A fixed calibration is used; the parameters are gathered from the intrinsic parameters of the camera. In figure 3.9 an example of a 3d reconstruction can be seen. A line of camera positions at the scene, called a camera trajectory, is also calculated.

3.2.3 The data sets

The data sets are gathered from another survey [16]. In total there are 15 different videos, all recorded with a Canon s95 camera. A general overview is given of each set and what kind of case it may represent for the structure from motion scenario. The naming of the videos is taken from the source:

Figure 3.8: An image with varying levels of Gaussian noise. (a) A low variance of Gaussian noise (var = 0.0011), PSNR = 29.69 dB. (b) A medium variance of Gaussian noise (var = 0.0088), PSNR = 20.74 dB. (c) A high variance of Gaussian noise (var = 0.0352), PSNR = 15.17 dB.

Figure 3.9: An example of a 3d reconstruction in VisualSFM.

• MOV01
• MOV07
• MOV08
• MOV09
• MOV13
• MOV14
• MOV15
• MOV16
• MOV20
• MOV21
• MOV22
• MOV23
• MOV29
• MOV30

A couple of synthetic data sets are also used; they are gathered from another survey [10]. Each data set has unique calibration settings:

• MOV_alley_2
• MOV_mountain_1
• MOV_temple_2


The following compressions were run for the movies captured with the Canon s95 camera and the synthetic data sets:

• BTC

• Bilinear interpolation

• TC, without alpha (block size 8x8 only for MOV01, MOV_mountain_1, MOV14, MOV15 and MOV23)

• Resize

• Monochrome palette

• Erase LSB:s

• Bilinear interpolation with erase 4 LSB:s (only for MOV01, MOV13, MOV14, MOV15 and MOV23)

• Bilinear interpolation with BTC (only for MOV01, MOV13, MOV14, MOV15 and MOV23)

• Jpeg

• Jpeg2000

• Webp

4 Results

4.1 Implementation

The compression algorithms chosen for hardware implementation were:

• Bilinear interpolation from corners

• Block truncation coding

• Texture Compression, without alpha (TC)

The post-synthesis results of the used hardware resources are seen in table 4.1 and the post-implementation results in table 4.2. The maximum frequency of each hardware solution is seen in table 4.3. The target for synthesis was a Kintex-7 XC7K70T FPGA from Xilinx.

                  LUT    FF    Memory LUT   IO    BUFG   DSP48
BTC decoder         64   320            0   161      1       0
BTC coder         1505   320            0   161     12       0
bilinear decoder   435   320            0   161      1       0
bilinear coder      32    64           32    65      1       0
TC decoder         128   352            0   177      1       2
TC coder          1746   352            0   177      1       2

Table 4.1: Post-synthesis results.


                  LUT    FF    Memory LUT   IO    BUFG   DSP48
BTC decoder         64   320            0   161      1       0
BTC coder         1505   320            0   161     12       0
bilinear decoder   435   320            0   161      1       0
bilinear coder      16    64           16    65      1       0
TC decoder         128   352            0   177      1       2
TC coder          1770   352            0   177      1       2

Table 4.2: Post-implementation results.

                  Max frequency (MHz)
BTC decoder       133.333
BTC coder         500.000
bilinear decoder  181.818
bilinear coder    500.000
TC decoder        181.818
TC coder          58.824

Table 4.3: Post-implementation, maximum frequency.

4.2 Evaluation

The remaining movies captured with the Canon s95 camera and the synthetic data sets were simulated for evaluation. Some of the compressed data sets failed in the simulations and cannot be seen in the results. The failed data sets can be categorized as follows:

• Faulty ground truth: VisualSFM did not calculate a good enough camera trajectory for the raw image sequence.

• Missed camera positions in the test trajectories.

• Faulty test trajectory: a wrongly calculated trajectory in VisualSFM.

• Both a faulty ground truth and a faulty test trajectory, which is hard to recognize.

Some of the mentioned cases are shown in figure 4.1. Figures 4.1a, 4.1b and 4.1c show the more expected behavior for comparison of camera trajectories: only the test trajectory may be somewhat wrongly calculated, while the ground truth trajectory is a fine continuous shape. In figure 4.1d, however, a much worse case is shown: both the ground truth and the test trajectories are faulty, yet the area error is less than 100. The area error in the graphs is scaled by a factor of 10000, and as stated earlier the area error should satisfy 0 < ae < 0.05 m² for the camera trajectories to be usable. Due to these issues the simulation results may be interpreted as passed or failed, where passed means an area error of ae < 0.05 (0 < ae < 500 in the graphs). Only the simulation results for MOV01 and one synthetic movie are presented here; the whole set of simulation results can be seen in appendix A.


Figure 4.1: Four different camera trajectory comparisons. (a) A good camera trajectory comparison for MOV01. (b) A slightly worse camera trajectory comparison for MOV01. (c) A bad camera trajectory comparison for MOV01. (d) A faulty camera trajectory comparison for MOV08.

4.2.1 MOV01

Figure 4.2: Some pictures from video MOV01.

Simulation results of the compression algorithms for benchmarking can be seen in figure 4.3 and of the self-made algorithms in figure 4.4; the failed compressions are also noted in the graphs. MOV01 is a very easy scene to reconstruct a 3d model of: there are many images in the sequence and the scene itself does not contain many complex patterns for the feature detection. Some of the images of MOV01 can be seen in figure 4.2. The solid lines appearing in some graphs represent data sets for which VisualSFM failed to render a 3d model.

Figure 4.3: Simulation results from MOV01 of the compression algorithms for benchmarking (area error, 10^-4 [m^2], vs. average compression rate, log scale).

Figure 4.4: Simulation results from MOV01 of the self-made compression algorithms.

4.2.2 MOV08 and MOV09

Figure 4.5: Some pictures from video MOV08.

Figure 4.6: Some pictures from video MOV09.

Most of the image sequences failed for MOV08 and MOV09; figures 4.5 and 4.6 show some of the images. The image sequences were too complex for VisualSFM to do feature detection on, so the simulation results are not shown here.

4.2.3 MOV_mountain_1

Figure 4.7: Some pictures from video MOV_mountain_1.

This is one of the synthetic data sets; some images from it can be seen in figure 4.7. It has an easier scene for feature detection. The simulation results of the compression algorithms for benchmarking can be seen in figure 4.8 and of the self-made algorithms in figure 4.9.

Figure 4.8: Simulation results from MOV_mountain_1 of the compression algorithms for benchmarking.

Figure 4.9: Simulation results from MOV_mountain_1 of the self-made compression algorithms.

5 Discussion

5.1 Results

The data sets MOV01, MOV13 and MOV14 have many passed simulation results with an area error ae < 0.05, while for MOV08 and MOV09 more or less all simulation results failed. Scenes with non-complex patterns could be one of the reasons behind this. Among the synthetic data sets, the only one with low area errors was MOV_mountain_1, in which the camera hovers between some mountains and slowly moves forward. The lack of images in the synthetic image sequences could be a reason that the rest had many high area errors.

It seems that block-wise compression of the images does not introduce any major area errors compared to frame-wise compression with resize and monochrome palette. The self-proposed compression algorithms' area errors are very low for all three data sets MOV01, MOV13 and MOV14. The same goes for the compression algorithms for benchmarking, as well as for different levels of Gaussian noise on the corresponding data sets.

In the implementation, the resource utilization of the TC and BTC coders is much greater than that of bilinear interpolation, as can be expected; the comparison is reversed for the decoders. The maximum frequencies can be discussed further: improving the hardware implementation of the BTC decoder may bring it up to par with the rest, but the TC coder's maximum frequency reveals a big problem. The TC coder suffers from slow processing of the data; the two major bottlenecks are finding the extreme values (min and max) and interpolating two values between them.

5.2 Method

The chosen tool VisualSFM demands a lot of computer resources; the simulation times could be several hours per image sequence for the feature detection and comparison between images. It was also uncertain what outcome each data set would give when compressed and sent to VisualSFM. A rule of thumb was followed when choosing easier or harder image sequences for VisualSFM: slow camera motion alongside a couple of objects was preferred. This, however, required a couple of retries and different recorded scenes to understand. Correct calibration settings were mandatory to build the 3d models and acquire a complete set of camera trajectories.

5.3 Future work

Simulating color image sequences would be the next step. The self-proposed algorithms TC without alpha, BTC and bilinear interpolation from corners would need to be redone for this. The compression (coding) speed of TC would suffer from finding the Euclidean min and max distance in a block of pixels using the red, blue and green color channels. BTC is mainly designed for grayscale images, not color, but the same method might still work if each color channel is BTC coded separately. Bilinear interpolation from corners would need to take each color channel into the calculation to interpolate pixels.

6 Conclusions

What is the correlation between compression ratio and detected features in a set of images?

The detected features in a data set can be expressed with the measured area error; however, this does not take into account the specific number of features a compressed data set has compared to the corresponding raw image data set. It is hard to measure in terms of detected features directly, as this would require comparing each compressed image with the positions and types of features (corners, edges, etc.) in the corresponding raw image. Referring to the simulation results of MOV08 and MOV09, a possible answer to the first question is that compression algorithms that introduce sharper contours, edges, etc. in images may give low area errors for very detailed image sequences, implying that the list of detected features is equal between the compressed and raw sets. The compression ratio is dependent on the number of bits erased.

What are the most critical cases in structure from motion to be considered for the scenario?

The most critical cases in structure from motion, based on the results, were image sequences with many details and generic or flat surfaces.

Can the compression ratio and number of detected features improve by combining two or more different compression algorithms?

The data sets MOV01, MOV13, MOV14, MOV15 and MOV23 have bilinear interpolation from corners combined with BTC or erased LSB bits in the simulation results. In all five data sets, this shows no improvement compared to the rest of the compression algorithms. Neither does it introduce any major area errors, so the combinations may still be useful for the given scenario. Since the five data sets turned out to have very low area errors for most compression algorithms, other data sets with more detailed image sequences may be needed to carry out the simulations on.


Other conclusions besides the answers to the main questions arise as well:

• Using grayscale image sequences was a simple way to see what outcome they would have in structure from motion.

• Using correct calibration settings for VisualSFM is mandatory to build the 3d models and create a complete set of camera trajectories.

• The choice of data sets is highly dependent on the amount of detail in the image sequences.

• Compression algorithms like BTC are mainly for grayscale images and not colored ones. This may have given BTC an advantage compared to the other compression algorithms in this thesis survey.

A The Simulation results

(56)

42 A The Simulation results 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg failed q_5 failed q_35 failed q_55 q_15 q_25 q_45 q_65 q_75 q_85 q_95 q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg2k failed r_70 failed r_65 failed r_60 failed r_20 failed r_15 r_55 r_45 r_40 r_35 r_30 r_25 r_10 r_5 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

webp q_5 q_15 q_25 q_35 q_45 q_55 q_65 q_75 q_85 q_95 q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_normal failed q_255 q_5 q_35 q_65 q_95 q_125 q_155 q_185 q_215 q_245 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_superfast failed q_95 q_5 q_35 q_65 q_125 q_155 q_185 q_215 q_245 q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_uber q_5 q_35 q_65 q_95 q_125 q_155 q_185 q_215 q_245 q_255

Figure A.1: Simulation results from MOV01 of the compression algorithms

(57)

43

0 100 200 300 400 500

Area error (10−4 [m2])

100

101

Average compression rate (log)

bilinear, BTC, resize and TC

failed BTC failed resize bilinear TC blocksize 4x4 TC blocksize 8x8 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6

Number of erased lsb bits

erase_lsb_bits erased lsb bits 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6 7 8

Number of used bits

monochrome_palette

failed

used bits for monochrome palette

0 100 200 300 400 500 Area error (10−4 [m2]) 0.000 0.001 0.002 0.003 0.004 0.005 Variance gaussian_noise failed level of gaussian noise

0 100 200 300 400 500 Area error (10−4 [m2]) 0 5 10 15 20

Threshold, std value of block

bilinear_btc

failed Used std for threshold

0 100 200 300 400 500 Area error (10−4 [m2]) 0 5 10 15 20

Threshold, std value of block

bilinear_erase

failed Used std for threshold

(58)

44 A The Simulation results 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg failed q_95 q_5 q_15 q_25 q_35 q_45 q_55 q_65 q_75 q_85 q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg2k failed r_45 r_70 r_65 r_60 r_55 r_40 r_35 r_30 r_25 r_20 r_15 r_10 r_5 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

webp q_5 q_15 q_25 q_35 q_45 q_55 q_65 q_75 q_85 q_95 q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_normal failed q_95 failed q_155 q_5 q_35 q_65 q_125 q_185 q_215 q_245 q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_superfast failed q_155 q_5 q_35 q_65 q_95 q_125 q_185 q_215 q_245 q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_uber q_5 q_35 q_65 q_95 q_125 q_155 q_185 q_215 q_245 q_255

Figure A.3: Simulation results from MOV07 of the compression algorithms

(59)

45

0 100 200 300 400 500

Area error (10−4 [m2])

100

101

Average compression rate (log)

bilinear, BTC, resize and TC

bilinear TC blocksize 4x4 btc resize 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6

Number of erased lsb bits

erase_lsb_bits erased lsb bits 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6 7 8

Number of used bits

monochrome_palette

used bits for monochrome palette

0 100 200 300 400 500 Area error (10−4 [m2]) 0.000 0.001 0.002 0.003 0.004 0.005 Variance gaussian_noise failed level of gaussian noise

(60)

46 A The Simulation results 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg failed q_5 failed q_25 failed q_55 q_15 q_35 q_45 q_65 q_75 q_85 q_95 q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg2k failed r_70 failed r_65 failed r_60 failed r_55 failed r_45 failed r_40 failed r_30 failed r_25 failed r_20 r_35 r_15 r_10 r_5 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

webp failed q_45 failed q_55 failed q_75 q_5 q_15 q_25 q_35 q_65 q_85 q_95 q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_normal failed q_35 failed q_65 failed q_95 failed q_155 failed q_185 failed q_215 failed q_245 q_5 q_125 q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_superfast failed q_5 failed q_35 failed q_65 failed q_95 failed q_125 failed q_155 failed q_185 failed q_245 q_215 q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_uber failed q_5 failed q_65 failed q_95 failed q_125 failed q_155 failed q_185 failed q_215 failed q_245 q_35 q_255

Figure A.5: Simulation results from MOV08 of the compression algorithms

(61)

47

0 100 200 300 400 500

Area error (10−4 [m2])

100

101

Average compression rate (log)

bilinear, BTC, resize and TC

failed bilinear failed TC blocksize 4x4 failed BTC failed resize 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6

Number of erased lsb bits

erase_lsb_bits failed erased lsb bits 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6 7 8

Number of used bits

monochrome_palette

failed

used bits for monochrome palette

0 100 200 300 400 500 Area error (10−4 [m2]) 0.000 0.001 0.002 0.003 0.004 0.005 Variance gaussian_noise failed level of gaussian noise

(62)

48 A The Simulation results 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg failed q_5 failed q_15 failed q_25 failed q_35 failed q_45 failed q_55 failed q_65 failed q_75 failed q_85 failed q_95 failed q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg2k failed r_70 failed r_65 failed r_60 failed r_55 failed r_45 failed r_40 failed r_35 failed r_30 failed r_25 failed r_20 failed r_15 failed r_10 failed r_5 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

webp failed q_5 failed q_15 failed q_25 failed q_35 failed q_45 failed q_55 failed q_65 failed q_75 failed q_85 failed q_95 failed q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_normal failed q_5 failed q_35 failed q_65 failed q_95 failed q_125 failed q_155 failed q_185 failed q_215 failed q_245 failed q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_superfast failed q_5 failed q_35 failed q_65 failed q_95 failed q_125 failed q_155 failed q_185 failed q_215 failed q_245 failed q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_uber failed q_5 failed q_35 failed q_65 failed q_95 failed q_125 failed q_155 failed q_185 failed q_215 failed q_245 failed q_255

Figure A.7: Simulation results from MOV09 of the compression algorithms

(63)

49

0 100 200 300 400 500

Area error (10−4 [m2])

100

101

Average compression rate (log)

bilinear, BTC, resize and TC

failed bilinear failed TC blocksize 4x4 failed BTC failed resize 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6

Number of erased lsb bits

erase_lsb_bits failed 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6 7 8

Number of used bits

monochrome_palette failed 0 100 200 300 400 500 Area error (10−4 [m2]) 0.000 0.001 0.002 0.003 0.004 0.005 Variance gaussian_noise failed

(64)

50 A The Simulation results 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg q_5 q_15 q_25 q_35 q_45 q_55 q_65 q_75 q_85 q_95 q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

jpeg2k failed r_40 failed r_20 r_70 r_65 r_60 r_55 r_45 r_35 r_30 r_25 r_15 r_10 r_5 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101 102

Average compression rate (log)

webp q_5 q_15 q_25 q_35 q_45 q_55 q_65 q_75 q_85 q_95 q_100 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_normal q_5 q_35 q_65 q_95 q_125 q_155 q_185 q_215 q_245 q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_superfast failed q_65 q_5 q_35 q_95 q_125 q_155 q_185 q_215 q_245 q_255 0 100 200 300 400 500 Area error (10−4 [m2]) 100 101

Average compression rate (log)

crunch_uber failed q_65 failed q_95 failed q_245 failed q_255 q_5 q_35 q_125 q_155 q_185 q_215

Figure A.9: Simulation results from MOV10 of the compression algorithms

(65)

51

0 100 200 300 400 500

Area error (10−4 [m2])

100

101

Average compression rate (log)

bilinear, BTC, resize and TC

bilinear TC blocksize 4x4 btc resize 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6

Number of erased lsb bits

erase_lsb_bits erased lsb bits 0 100 200 300 400 500 Area error (10−4 [m2]) 0 1 2 3 4 5 6 7 8

Number of used bits

monochrome_palette

used bits for monochrome palette

0 100 200 300 400 500 Area error (10−4 [m2]) 0.000 0.001 0.002 0.003 0.004 0.005 Variance gaussian_noise failed level of gaussian noise


Figure A.11: Simulation results from MOV13 of the compression algorithms.

[Plot residue removed; same six panels as Figure A.7, average compression rate (log) vs. area error (10⁻⁴ [m²]).]


[Plot residue removed; the caption was lost in extraction (presumably Figure A.12, the MOV13 results for the simpler schemes). In addition to the four panels of the previous simpler-scheme figures ("bilinear, BTC, resize and TC"; erase_lsb_bits; monochrome_palette; gaussian_noise), two hybrid panels appear: bilinear_btc and bilinear_erase, each sweeping a threshold on the block standard deviation ("Threshold, std value of block") against area error (10⁻⁴ [m²]).]
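The bilinear_btc and bilinear_erase panels sweep a threshold on the per-block standard deviation. One plausible reading, sketched below purely as an assumption (the hybrid's exact rule is not reproduced in this appendix), is that smooth blocks, with std below the threshold, take the cheap path, while detailed blocks are BTC-coded so that the corners and edges the feature tracker depends on survive. The sketch reuses btc_block from the BTC example above and stands in a flat block average for the bilinear path:

    import numpy as np

    def hybrid_bilinear_btc(img, threshold, block=4):
        """img: 2-D uint8 array with sides divisible by `block`."""
        out = np.empty_like(img)
        for r in range(0, img.shape[0], block):
            for c in range(0, img.shape[1], block):
                blk = img[r:r+block, c:c+block]
                if blk.std() < threshold:
                    # smooth block: one representative sample is enough
                    out[r:r+block, c:c+block] = np.uint8(round(blk.mean()))
                else:
                    # detailed block: keep edges with BTC (btc_block above)
                    out[r:r+block, c:c+block] = btc_block(blk)
        return out

A threshold of 0 then degenerates to pure BTC, and a very large threshold to pure downsampling, which matches the 0 to 20 sweep range on the axis.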


Figure A.13: Simulation results from MOV14 of the compression algorithms.

[Plot residue removed; same six panels as Figure A.7, average compression rate (log) vs. area error (10⁻⁴ [m²]).]


[Plot residue removed; the caption was lost in extraction (presumably Figure A.14, the MOV14 results for the simpler schemes). Same six panels as the previous simpler-scheme figure, including the bilinear_btc and bilinear_erase threshold sweeps.]
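The bilinear and resize panels in these figures amount to a down/upsampling round trip: the frame is stored at reduced resolution and interpolated back to full size before feature detection. A minimal Pillow sketch, assuming a factor-2 reduction (the actual factors are not recoverable from the plots):

    from PIL import Image

    def resize_roundtrip(frame, factor=2):
        """frame: PIL image in mode 'L'; returns the degraded full-size frame."""
        w, h = frame.size
        small = frame.resize((w // factor, h // factor), Image.BILINEAR)
        return small.resize((w, h), Image.BILINEAR)

With factor 2 only a quarter of the pixels are written to off-chip memory, a fixed 4:1 rate, at the cost of blurring exactly the high-frequency detail the feature detector looks for.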


Figure A.15: Simulation results from MOV15 of the compression algorithms.

[Plot residue removed; same six panels as Figure A.7, average compression rate (log) vs. area error (10⁻⁴ [m²]).]


[Plot residue removed; the caption was lost in extraction (presumably Figure A.16, the MOV15 results for the simpler schemes). Same panels as the previous simpler-scheme figures; the resize entry is absent from the first panel's legend here.]
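Finally, the gaussian_noise panels are a degradation reference rather than a compression scheme: zero-mean Gaussian noise is added at the variances on the axis (0.000 to 0.005, which suggests pixel values scaled to [0, 1]). A minimal sketch under that assumption:

    import numpy as np

    def add_gaussian_noise(img, variance, rng=None):
        """img: 2-D uint8 array; variance as on the plot axis, e.g. 0.001."""
        if rng is None:
            rng = np.random.default_rng()
        x = img.astype(np.float64) / 255.0
        noisy = x + rng.normal(0.0, np.sqrt(variance), size=x.shape)
        return np.clip(np.round(noisy * 255.0), 0, 255).astype(np.uint8)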
