Land-Cover Mapping in Stockholm Using Fusion of ALOS PALSAR
and SPOT Data
Johan Wallin
Master of Science Thesis in Geoinformatics TRITA-GIT EX 08-012
School of Architecture and the Built Environment Royal Institute of Technology (KTH)
100 44 Stockholm, Sweden
October 2008
TRITA-GIT EX 08-012 ISSN 1653-5227 ISRN KTH/GIT/EX--08/012-SE
Acknowledgement
I would like to express my gratitude towards my supervisor Yifang Ban for her support throughout this work.
This research was supported by the Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning (FORMAS) through a grant awarded to my supervisor, Professor Yifang Ban.
Abstract
The objective of this research was to investigate the capabilities of fusing SAR data from the PALSAR sensor with optical data from the SPOT sensor for land-cover classification, using a hierarchical approach.
The study area was Stockholm, Sweden. Two dual-polarization PALSAR images and one multispectral SPOT HRG image, all acquired in the summer of 2007, were used. The data was classified at two levels. First, the images were separated into four classes (Water, Forest, Urban and Open Area) with an artificial neural network (ANN) classifier. In the second step, these classes were refined by a hybrid classifier into Water, Forest, Low Density Built-up, High Density Built-up, Road, Park and Open Field.
As some areas of the optical image were covered by clouds, a hierarchical classification using only PALSAR was also made. This classification was used to fill in the “information gaps” in the joint classification of SPOT and PALSAR.
The result from the hierarchical classifier shows an overall accuracy increase of more than 10% compared to an ordinary ANN classifier (from 75.4% to 87.6%). The accuracy of all land-cover classes increased except for Low Density Built-up, where the two classifiers had approximately the same result.
To test the capabilities of PALSAR for land-cover classification, two reference classifications using only ANN were created. The comparison of these two land-cover maps shows an overall accuracy increase when PALSAR data is included compared to using optical data alone. The accuracy of the classes Forest and Open Field increased in particular: Forest from 87.6% to 94.0% and Open Field from 34.1% to 72.3%.
The research shows that PALSAR data can, to some degree, be used to improve land-cover classification in urban areas, and that the hierarchical approach increases classification accuracy compared to pixel-based classification.
Contents
Acknowledgement iii
Abstract v
1 Introduction 1
1.1 Rationale for the Research . . . . 1
1.2 Research Objectives . . . . 3
2 Literature Review 5
2.1 Effects of SAR System Parameters on Urban Land-Cover Classification . . . . 5
2.1.1 Frequency . . . . 5
2.1.2 Polarization . . . . 6
2.1.3 Incidence Angle . . . . 7
2.1.4 Summary . . . . 8
2.2 Analysis Methods . . . . 8
2.2.1 Texture . . . . 8
2.2.2 Speckle Filtering . . . . 9
2.3 Optical data for urban land-cover mapping . . . . 9
2.4 Fusion of SAR and Optical Data . . . . 10
2.5 Image Classification . . . . 11
2.5.1 ANN Classifier . . . . 11
2.5.2 Object-Based Classification . . . . 11
2.5.3 Hierarchical Approach . . . . 12
3 Study Area and Data Description 13
3.1 Study Area . . . . 13
3.2 Data Description . . . . 14
3.2.1 PALSAR . . . . 14
3.2.2 SPOT . . . . 16
3.2.3 Vector Data . . . . 16
3.2.4 DEM . . . . 16
4 Methodology 19
4.1 Geometric Correction . . . . 21
4.2 Classification Scheme . . . . 21
4.3 Backscatter Profiles . . . . 23
4.4 Image Processing . . . . 24
4.4.1 Speckle Filter . . . . 24
4.4.2 Texture Filter . . . . 24
4.4.3 PCA analysis . . . . 25
4.4.4 Cloud Masking . . . . 25
4.5 ANN classification . . . . 26
4.6 Rule-based / Object-Based approach . . . . 27
4.6.1 Segmentation . . . . 27
4.6.2 Rules for Water . . . . 28
4.6.3 Rules for Forest . . . . 28
4.6.4 Rules for Low Density Built-up . . . . 28
4.6.5 Rules for High Density Built-up . . . . 29
4.6.6 Rules for Roads . . . . 29
4.6.7 Rules for Recreational Areas . . . . 30
4.6.8 Rules for Open Fields . . . . 30
4.7 Accuracy Assessment . . . . 30
5 Results and Discussion 33
5.1 Geometric Correction . . . . 33
5.2 Backscatter Profiles . . . . 33
5.3 Image processing . . . . 35
5.3.1 Texture . . . . 35
5.3.2 Speckle . . . . 37
5.4 Reference Classifications . . . . 37
5.5 ANN Classification . . . . 39
5.6 Segmentation Results . . . . 41
5.7 Rule-Based Classification . . . . 41
5.8 Fusion of SPOT and PALSAR . . . . 45
5.9 Summary . . . . 47
6 Conclusions 49
A Confusion matrices 55
List of Figures
3.1 Study area . . . . 14
3.2 PALSAR data . . . . 15
3.3 SPOT data . . . . 17
4.1 Flowchart . . . . 20
5.1 Backscatter profiles from PALSAR. . . . 34
5.2 Texture filters . . . . 36
5.3 Speckle filters . . . . 36
5.4 ANN classification . . . . 40
5.5 Segmentation . . . . 42
5.6 Classification results using SPOT and PALSAR . . . . 43
5.7 Classification results using PALSAR . . . . 45
5.8 Comparison of classification with and without PALSAR . . . 46
5.9 Final Classification . . . . 48
List of Tables
2.1 RADAR bands . . . . 6
3.1 Configuration of PALSAR sensor . . . . 15
3.2 Configuration of SPOT-5 HRG sensor . . . . 17
4.1 Classification scheme . . . . 22
5.1 Results of the geometric correction . . . . 33
5.2 Standard deviation of Backscatter Profiles. . . . . 35
5.3 Accuracies for ANN classification using SPOT. This classification is used as a reference to evaluate the performance of the first broad classifier . . . . 37
5.4 Accuracies for reference classification using SPOT and PALSAR 38
5.5 Accuracies for reference classification using SPOT . . . . 38
5.6 Accuracies for reference classification using PALSAR . . . . . 38
5.7 ANN classification using SPOT and PALSAR . . . . 39
5.8 ANN classification using PALSAR . . . . 39
5.9 Accuracy of the hierarchical classification using both SPOT and PALSAR. . . . 44
5.10 Accuracy of the hierarchical classification using only PALSAR. 44
5.11 Overall accuracies of the different classifiers . . . . 47
A.1 Confusion matrix for classification of SPOT and PALSAR . . 56
A.2 Confusion matrix for reference classification using PALSAR . 57
A.3 Confusion matrix for classification of PALSAR . . . . 58
A.4 Confusion matrix for reference classification using SPOT . . . 59
A.5 Confusion matrix for reference classification using SPOT and PALSAR . . . . 60
Chapter 1
Introduction
1.1 Rationale for the Research
The production of land-cover maps from satellite sensors is an important task. The maps produced can, for example, be used to extract information on change and growth of urban areas (and the corresponding decrease in green areas), be incorporated into state or local government GIS databases (Shackelford & Davis, 2003b), or be used for environmental planning.
As many land-cover types have similar spectral signatures in the optical part of the spectrum, a combination of different types of sensors can be used to improve the classification. As new SAR sensors with higher resolution have been developed, fusion of SAR with optical data has become a growing field of interest (Orsomando & Lombardo, 2007; Ban, 2003; Ban & Hu, 2007).
Data from SAR sensors and optical sensors can complement each other: because of the all-weather capability of SAR, the temporal frequency of SAR data is high, but on the other hand its spatial resolution is often lower. Optical data has lower temporal frequency but often better spatial resolution (Orsomando & Lombardo, 2007). Another reason for fusion, as reported by Ban and Hu, is that data from different parts of the spectrum provide complementary information and thereby often increase the classification accuracy (Ban et al., 2007).
In recent years, work has been done to include SAR images in the classification of urban areas. The all-weather capability of SAR is one of the main reasons to use it for classification. Another is that SAR conveys the greatest amount of structural information for urban areas (Dell’Acqua & Gamba, 2003). SAR data can also be useful to fill information gaps for regions covered by clouds in optical images, and as Scandinavia is often covered by clouds, this is a valuable property of SAR.
Many different approaches to the joint classification of SAR and optical data have been investigated, and these methods have been shown to generate good results for their specific land-use type. But as a method improves the classification accuracy for one land-use type, it might decrease the accuracy for another. In other words, no single method alone can classify the whole area with high accuracy (Ban, 2003; Shackelford & Davis, 2003b).
Shackelford and Davis have proposed a method to get around this problem: they use an ordinary maximum likelihood classifier to separate four well-defined groups (Grass-Tree, Road-Buildings, Water-Shadow, Bare Soil), and then classifiers specialised for each specific land-use type are used to further separate the four classes into subclasses (Shackelford & Davis, 2003b). This method, called the hierarchical approach, is similar to the method described by Ban, called sequential masking classification, where easily distinguishable classes are masked out to let you focus on separating classes with similar signatures (Ban, 2003). Using a hierarchical approach allows you to use different classification methods for different land-use types.
Images acquired by the Advanced Land Observing Satellite (ALOS) and SPOT-5 are used. The ALOS platform carries three sensors, of which the Phased Array type L-band Synthetic Aperture Radar (PALSAR) is used for this study. From the SPOT satellite, a scene acquired in August 2007 by the multispectral HRG sensor is used, and from the PALSAR sensor, two scenes acquired in June and July 2007 respectively are used. There are two main reasons for choosing the PALSAR data. First, not much work has been done on the PALSAR sensor yet, and second, and most important, the data suits the objectives. The SPOT HRG sensor was chosen partly because its 10 x 10 metre resolution matches the 12.5 metre PALSAR resolution, but also because the sensor has bands in the near-infrared, mid-infrared and visible parts of the spectrum. The PALSAR imagery together with the HRG image forms the base for the first broad classification in the hierarchy, where built-up, vegetation, bare soil and water are separated. The PALSAR data adds the textural information needed for this classification, while the SPOT data adds spectral information.
The hierarchical approach proposed by Shackelford and Davis is used: the data is separated into sets using a broad classifier, here an artificial neural network (ANN) classifier. The sets are then refined using different methods for the different subsets. For the built-up areas, a classifier combining object-based classification, rule-based classification and brightness values is used. The object-based classifier together with the rule-based classifier is used to separate buildings by the properties of their surrounding features. The brightness values from both optical multispectral data and SAR texture are used as further parameters in the classification.
For the vegetation set, texture measures from the PALSAR data are combined with brightness values from the multispectral SPOT data for further classification. Some rule-based/object-based classifiers are also tested.
1.2 Research Objectives
The objectives of this research are to improve land-cover classification of urban areas using a hierarchical approach, and to investigate the capabilities of SAR data, and especially of PALSAR data together with SPOT data, for land-cover classification of urban areas.
Chapter 2
Literature Review
In this chapter, the state of research on land-cover classification with SAR and optical data is presented, together with background information about the different techniques used in this research.
2.1 Effects of SAR System Parameters on Urban Land-Cover Classification
The important SAR system parameters are described below.
2.1.1 Frequency
The wavelengths used in SAR systems are classified into different bands (Table 2.1). Although some sensors have the ability to vary between different wavelengths, a sensor is normally built for one specific radar band. Wavelength determines whether the transmitted signal will be attenuated or dispersed by the atmosphere. Shorter wavelengths are more affected by the atmosphere, and for wavelengths below 4 cm the atmospheric effect can be serious. Wavelengths below 2 cm may also be affected by rain and clouds, causing “shadows” in the image. For longer wavelengths (P-band) at high altitude, the ionosphere will seriously affect the transmission (Lillesand et al., 2004, chap. 2).
Wavelength also determines which surfaces will look rough or smooth. A measure of this is the Rayleigh criterion, which states that a surface can be described as rough if the root mean square of the height differences of the surface is greater than the wavelength divided by eight times the cosine of the local incidence angle; otherwise it is described as smooth. A smooth surface will reflect most of the signal away, while a rough surface will scatter the signal, reflecting more back to the sensor (Lillesand et al., 2004, chap. 8).
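The Rayleigh criterion above can be sketched as a small calculation. This is a minimal illustration: the threshold λ/(8 cos θ) follows the standard formulation of the criterion, and the sample surface and sensor values are hypothetical.

```python
import math

def is_rough(rms_height_cm: float, wavelength_cm: float, incidence_deg: float) -> bool:
    """Rayleigh criterion: a surface appears rough to the radar if the RMS
    height variation exceeds wavelength / (8 * cos(incidence angle))."""
    threshold = wavelength_cm / (8 * math.cos(math.radians(incidence_deg)))
    return rms_height_cm > threshold

# Hypothetical example: 5 cm RMS height variation seen by an L-band sensor
# (23.6 cm wavelength) at a 38-degree incidence angle.
print(is_rough(5.0, 23.6, 38.0))
```

The same 5 cm surface would appear much rougher to a short-wavelength X-band sensor, which is why roughness is always relative to the wavelength used.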
Wavelength affects not only the apparent roughness of surfaces, but also the resolution; all other parameters being the same, azimuth resolution will
Table 2.1: Radar Bands (Mikhail et al., 2001)
Band Wavelength (cm) Frequency (GHz)
K 0.83-2.75 36-10.9
X 2.75-5.21 10.9-5.75
C 5.21-7.69 5.75-3.9
S 7.69-19.4 3.9-1.55
L 19.4-76.9 1.55-0.39
P 76.9-133 0.39-0.225
increase with decreasing wavelength.
In a study of the differences in interpretability between X-band and L-band images of urban areas, it was concluded that the best result is achieved when combining both wavelengths, but if only one wavelength were to be used, X-band is preferable (Leonard Bryan, 1975). Haack compared like- and cross-polarized L-band and X-band images over Los Angeles, and found that X-band like-polarization was the best, and L-band cross-polarization the worst, for interpretability (Henderson & Lewis, 1998, chap. 15).
For vegetation identification, a general rule is that when the wavelength approaches the mean size of plant components, the volume scattering increases. If the canopy is dense, this will lead to stronger backscatter. In general, shorter wavelengths (2-6 cm) are better for identification of crops and leaves, while longer wavelengths are better for tree trunks and limbs (Lillesand et al., 2004, chap. 8).
2.1.2 Polarization
Because SAR uses coherent waves, the sensor controls the orientation of the transmitted waves. The signals can be transmitted in either horizontal or vertical orientation, and when the signals interact with the earth’s surface, they are sometimes modified to include both polarizations (Mikhail et al., 2001, chap. 11). Some sensors can receive and transmit signals in both orientations, which combined makes four different combinations of signals - HH, HV, VV and VH, where the first character marks the transmitted signal and the second the received signal. HH and VV are referred to as like-polarized, as the transmitted and received signals are oriented the same way, and HV and VH are referred to as cross-polarized. The cross-polarized return is 8-25 dB weaker than the like-polarized return (Simonette & Ulaby, 1983, chap. 4).
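The quoted 8-25 dB difference between cross- and like-polarized returns can be made concrete with the standard decibel conversion. This is a minimal sketch; the power ratio used is a hypothetical example.

```python
import math

def to_db(power_ratio: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(power_ratio)

# A cross-polarized return carrying 1% of the like-polarized power
# is about 20 dB weaker -- within the quoted 8-25 dB range.
print(to_db(0.01))
```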
Like-polarized reflection is mostly caused by surface and volume scattering, while the cross-polarized return is mostly caused by either multiple scattering due to surface roughness or multiple volume scattering caused by inhomogeneities (Simonette & Ulaby, 1983, chap. 4). The orientation of the object strongly influences the amount of backscatter in the image.
The images produced by the different combinations of polarization will not be identical, but instead help us to differentiate geographic features. Cross-polarized images are reported to be the best choice for detecting shopping centres, institutional complexes and industrial areas - all lacking natural vegetation - while like-polarized images are reported to be better for discriminating vegetated areas within an urban complex. Cross-polarized images are also better for detecting linear features that are not parallel to the flight direction; in general, the effect of orientation is bigger in HH than in HV (Henderson & Lewis, 1998, chap. 15).
In a study of the differences between HH and HV in X-band images covering Los Angeles, it was concluded that when comparing seven urban land-cover classes, although they did show differences in response, the only one with a statistically different signal response in HH compared to HV was the commercial category. The land-cover classes with the most similar response in both polarizations were transportation, residential and industrial. It was also concluded that no single polarization is preferred for all land-cover categories within the urban area (Henderson & Mogilski, 1987).
2.1.3 Incidence Angle
The incidence angle is defined as the angle between the radar line of sight and the local vertical to the geoid at the target location. For a horizontal imaging plane, the incidence angle is the complement of the depression angle, defined as the angle between the local horizontal at the antenna and the radar line of sight (Mikhail et al., 2001, chap. 11). For urban-area remote sensing, the incidence angle will affect the detectability of settlements, as a lower incidence angle will result in lower range resolution. A low incidence angle will also increase the effect of radar shadows; settlements on back slopes will become invisible.
In a study of the effect of radar system parameters on settlement detection using the SIR-B sensor, it was concluded that images with a steep incidence angle (below 20°) were of minimum utility. These images had the poorest resolution, thereby limiting the ability to locate small features.
The study also showed that the overall detectability and detection accuracy increased with incidence angle until the angle reached 40.9°. After this point, the usability decreased. The study concluded that this might be the best angle for settlement detection, but that more work needs to be done in the area (Henderson, 1995).
In a multisensor analysis for land-use mapping in Tunisia, it was concluded that SIR-A images were better than SEASAT images for settlement detection. The two sensors had the same wavelength, polarization and spatial resolution. The only major difference was the depression angle; SIR-A has a depression angle of 43° while the angle of SEASAT is 67° (Henderson & Lewis, 1998, chap. 15).
2.1.4 Summary
This section has described how the system parameters of SAR affect urban land-cover classification. To summarize: the best choice of frequency, if both X- and L-band are available, is a combination of both. If only one is to be used, X-band is preferable.
When it comes to polarization, no single polarization is best for all features, although the cross-polarized return has been reported to be the best choice for detecting vegetated areas inside the urban complex and for mapping industrial complexes. Finally, the image used should not be acquired with too small an incidence angle (below 20°).
2.2 Analysis Methods
2.2.1 Texture
When humans interpret a remotely sensed scene, they take into account context, edges, texture and tonal variation. This can be compared to computer processing, where often only tonal information is used. To make the computer interpretation of images more like that made by humans, and thereby improve the classification accuracy, texture filters are often included in the process of image classification (Jensen, 2005, chap. 8; Ban & Wu, 2005).
Texture can be defined as differences in discrete tonal values over a specific area. A useful way to estimate this when classifying SAR images is spatial autocorrelation (Henderson & Lewis, 1998, chap. 2). Autocorrelation measures how changes are related to distance. In SAR images, typically two different types of texture are present: scene texture and speckle. Scene texture can be compared to texture in optical images; it is caused by the nature of the surface being observed. Speckle, on the other hand, is caused by the radar or the processing system (Henderson & Lewis, 1998, chap. 2). Speckle, and how to deal with it in image classification, is described in the next section.
In one study, Dell’Acqua and Gamba compared the results of different texture filters for classification of an ERS-1 image over the urban and sub-urban areas of Pavia, northern Italy. Eight different second-order texture filters, all based on the co-occurrence matrix, were used: contrast, correlation, dissimilarity, entropy, mean, homogeneity, second moment and variance. The study concludes that mean, entropy, second moment and variance, with a block area of 21 pixels, were among the best choices. Some of the filters that were best when used alone were strongly correlated, and as a result, the best combination of four filters was dissimilarity, entropy, mean and variance (Dell’Acqua & Gamba, 2003).
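A co-occurrence (GLCM) texture measure of the kind compared in that study can be sketched in a few lines of NumPy. This is a simplified, single-offset illustration of the entropy measure, not the cited authors’ implementation; the test patches are hypothetical, and pixel values are assumed to be integers in [0, levels).

```python
import numpy as np

def glcm(image: np.ndarray, levels: int, dx: int = 1, dy: int = 0) -> np.ndarray:
    """Normalized grey-level co-occurrence matrix for one pixel offset.
    Counts how often grey level i occurs next to grey level j."""
    a = image[:image.shape[0] - dy, :image.shape[1] - dx]
    b = image[dy:, dx:]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)
    return m / m.sum()

def entropy(p: np.ndarray) -> float:
    """Entropy texture measure; high for disordered neighbourhoods."""
    nz = p[p > 0]
    return float(-np.sum(nz * np.log2(nz)))

# Hypothetical 4-level test patches: a uniform patch has zero entropy,
# a noisy patch has higher entropy.
flat = np.zeros((8, 8), dtype=int)
rng = np.random.default_rng(0)
noisy = rng.integers(0, 4, size=(8, 8))
print(entropy(glcm(flat, 4)), entropy(glcm(noisy, 4)))
```

The other second-order measures (contrast, dissimilarity, variance, etc.) are computed from the same normalized matrix with different weightings of the (i, j) entries.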
2.2.2 Speckle Filtering
All radar images contain some degree of speckle. It is caused by reflected signals being in or out of phase, which makes pixels in the image appear brighter or darker, seemingly at random (Lillesand et al., 2004, chap. 8). Speckle can be reduced by “multi-look” processing, where several looks of the same area are averaged together, but this will degrade the spatial resolution of the image (Nyoungui et al., 2002). Another way of reducing speckle is to use a speckle filter. Many different filters have been developed for this purpose, but when using a speckle filter there is always a trade-off between suppressing speckle and preserving edges; the two goals are hard to combine.
No single filter can be said to be the best. It all depends on the data used, what land cover is to be classified, what resolution the image has, and so on. Some research has however been done comparing the different filters; for example, Nyoungui et al. concluded that for land-cover classification the Lee LS filter gave superior results for visual and computer interpretation (Nyoungui et al., 2002).
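The trade-off between speckle suppression and edge preservation can be seen in the basic Lee filter, sketched below. This is a simplified form assuming a known, fixed noise variance; it is not the exact Lee LS variant evaluated in the cited study, and the window size and noise level are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def lee_filter(img: np.ndarray, size: int = 5, noise_var: float = 0.05) -> np.ndarray:
    """Basic Lee speckle filter: in flat areas the output approaches the
    local mean (suppressing speckle); near edges, where local variance is
    high, the original pixel value is largely preserved."""
    pad = size // 2
    win = sliding_window_view(np.pad(img, pad, mode="reflect"), (size, size))
    mean = win.mean(axis=(2, 3))
    var = win.var(axis=(2, 3))
    weight = var / (var + noise_var)   # ~0 in flat areas, ~1 at strong edges
    return mean + weight * (img - mean)

# A constant patch passes through unchanged: its local variance, and
# therefore the weight, is zero everywhere.
flat = np.full((16, 16), 2.0)
print(np.allclose(lee_filter(flat), flat))
```

The `noise_var` parameter encodes the trade-off directly: a larger value pushes the weight towards zero, giving stronger smoothing but softer edges.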
2.3 Optical data for urban land-cover mapping
With the development of high-resolution remote sensors, and better image segmentation and classification techniques, the possibilities for detailed urban land-cover mapping are boundless. But there will still be a need for the broader regional coverage that only medium-resolution imagery (like Landsat TM, SPOT HRG, etc.) can provide. The main difficulty when using medium-resolution data in urban areas is mixed pixels (Lee & Lathrop, 2005).
In a study of urban classification over Houston, Texas using a Landsat ETM+ scene, a fuzzy spectral mixture analysis (SMA) was developed to minimize the problem of mixed pixels. SMA as a method aims to map the fractions of landscape classes inside a mixed pixel. The fuzzy SMA approach was shown to reduce the mean absolute (classification) error compared to maximum likelihood from 0.58 to 0.18 (Tang et al., 2007).
Another method for extracting more information from mixed pixels was tested for classification of Landsat ETM+ data covering urban areas of New Jersey. Linear mixture modeling (LMM) was used to “unmix” the pixels, and the final classification was compared to a maximum likelihood classification of an IKONOS image. The study concluded that the method, together with Landsat ETM+ imagery, could provide a “reasonably accurate means of estimating urban land cover proportions” (Lee & Lathrop, 2005).
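The linear mixture idea can be sketched as a least-squares unmixing of one pixel. This is a toy illustration with two hypothetical endmember spectra and an unconstrained solver, not the cited study’s method (practical LMM adds non-negativity and sum-to-one constraints on the fractions).

```python
import numpy as np

# Hypothetical endmember spectra (columns), e.g. "vegetation" and
# "impervious" reflectance in three bands.
endmembers = np.array([[0.05, 0.30],
                       [0.45, 0.25],
                       [0.25, 0.40]])

# A mixed pixel composed of 60% of the first endmember and 40% of the second.
pixel = endmembers @ np.array([0.6, 0.4])

# Unconstrained least-squares estimate of the sub-pixel fractions.
fractions, *_ = np.linalg.lstsq(endmembers, pixel, rcond=None)
print(fractions)
```

With noise-free synthetic data the recovered fractions match the true mixture exactly; with real imagery the residual of the fit indicates how well the chosen endmembers explain the pixel.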
SPOT data has also been used for urban land-cover classification. One example is presented in a study where SPOT data is used to capture the spatial patterns of Beijing. The study reports that the SPOT data, when GLCM texture measures are also included, clearly shows the spatial patterns of the city (classification accuracy 79%) (Zhang et al., 2003).
Another type of sensor often used for urban land-cover classification is high-resolution sensors like QuickBird and IKONOS. One example is presented in a study where wavelet decomposition is used for urban feature extraction. The results are reported to be promising, and usage of IKONOS images is also suggested for this method (Ouma et al., 2006).
2.4 Fusion of SAR and Optical Data
Traditionally, urban land-cover classification has been done using optical images. In later years, some research has been done to include SAR images in this process. When land-cover classification is performed, it is always good to have information from many parts of the spectrum, and as SAR data represents a different part of the spectrum than optical imagery does, it can add information to the classification process, especially as many classes in the urban complex have similar signatures (Kuplich et al., 2000; Ban, 2003; Ban & Hu, 2007).
The all-weather capability of SAR data has also proven useful: it can fill in gaps caused by cloud cover in the optical images (Kuplich et al., 2000; Ban & Hu, 2007), and it makes data available more often; the SAR data can compensate for its lack of detail with high temporal frequency, while the optical image can compensate for its lower temporal frequency with more detail (Orsomando & Lombardo, 2007).
In a study of fusion of RADARSAT and QuickBird data for classification of the urban areas of Toronto, it was concluded that the RADARSAT data was able to increase the classification accuracy compared to only using the QuickBird data. The accuracy of the soybean class increased from 71% to 90%, the classification accuracy of rapeseed increased from 78% to 88%, and the accuracy of commercial-industrial areas increased from 63% to 76% when including RADARSAT data compared to only using QuickBird. The SAR data was also used to fill in areas of clouds in the optical data (Ban et al., 2007).
In another study, on the synergy of Landsat TM and ERS-1, the radar data was reported to help the discrimination of built-up areas, due to strong returns from corner reflectors, while not helping to discriminate between different crops. The authors concluded that the radar data could be seen as an important tool for the discrimination of certain land-cover classes (Kuplich et al., 2000). This was also reported by Ban: the classification of crops using only ERS-1 SAR did not give satisfying results, but the combination of Landsat TM and ERS-1 data increased the classification result by 8.3% compared to only using Landsat TM (Ban & Howarth, 1999).
2.5 Image Classification
2.5.1 ANN Classifier
Conventional classifiers make assumptions about the data being classified; for example, many of them assume that the data is normally distributed. For SAR data this assumption does not hold: due to speckle, radar data does not follow a Gaussian distribution. An alternative to the classical classifiers is the artificial neural network (ANN) classifier. ANN presents a distribution-free, nonparametric approach (Ban, 2003).
Neural networks simulate the thinking of the human mind, where neurons are used to process incoming information. In general, an ANN can be considered to comprise a large number of simple interconnected units that work in parallel to categorize incoming data into output classes. An ANN reaches a solution not in a step-by-step manner or through a complex logical algorithm, but in a non-algorithmic way, based on adjustments of the weights of connected neurons. Because an ANN does not assume a normal distribution, and because of its ability to adaptively model complex and nonlinear patterns, the ANN classifier has shown superior results compared to statistical classifiers (Foody et al., 1994; Jensen, 2005, chap. 10).
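The weight-adjustment idea described above can be sketched as a tiny one-hidden-layer network trained by gradient descent. This is a schematic toy, not the classifier used in this thesis: the two-dimensional "feature vectors" (say, a spectral band and a texture value), the class layout, the network size and the learning rate are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy training set: two well-separated 2-D classes, standing in for two
# easily distinguishable land-cover types.
X = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# One hidden layer of 8 units; training adjusts the weights of connected
# neurons, as described above, to reduce the log-loss.
W1, b1 = rng.normal(0, 0.5, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 0.5, 8), 0.0
for _ in range(300):
    h = np.tanh(X @ W1 + b1)                      # hidden activations
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))          # output class probability
    g = (p - y) / len(y)                          # gradient of log-loss
    gh = np.outer(g, W2) * (1 - h ** 2)           # backprop through tanh
    W2 -= 0.5 * h.T @ g; b2 -= 0.5 * g.sum()
    W1 -= 0.5 * X.T @ gh; b1 -= 0.5 * gh.sum(axis=0)

accuracy = np.mean((p > 0.5) == y)
print(accuracy)
```

Note that no distributional assumption about X is made anywhere, which is the property that makes this family of classifiers attractive for speckled SAR data.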
2.5.2 Object-Based Classification
In a pixel-based classifier, every pixel is classified separately, without considering the formation the pixel is part of, or what features surround it. Traditionally, only pixel-based classifiers have been used in remote sensing. An alternative to the pixel-based approach is the object-based one. By segmenting the image, in a meaningful way, into objects consisting of many pixels, object-based classifiers typically incorporate both spatial and spectral information. Compared to a pixel-based approach, where only spectral and textural information is considered, the object-based approach can also incorporate shape characteristics and neighbourhood relations (context) into the classification (Shackelford & Davis, 2003a; Ban & Hu, 2007; Jensen, 2005, chap. 9).
Several studies have indicated that, especially when working with high-resolution images, object-based rather than pixel-based classification should be used to realise the full potential of the image. Object-based classification will also result in a more homogeneous and more accurate mapping product, with higher detail in the class definition (Ban & Hu, 2007).
Object-based classification starts with segmentation, where the image is divided into objects. The aim of the segmentation is to create objects with a minimum of interior heterogeneity, where heterogeneity is defined in terms of both spectral and spatial variance. In the popular classification program eCognition, homogeneity in the spatial domain is defined by compactness and smoothness. Compactness is measured as the ratio between the object’s border length and the object’s total number of pixels, while smoothness is measured as the ratio between the object’s border length and the perimeter of the object’s bounding box (Ban & Hu, 2007), but there are of course many other ways of defining heterogeneity (Jensen, 2005, chap. 9).
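The two shape measures described for eCognition can be sketched as simple functions. This is a toy illustration following the ratios as stated in the text; eCognition’s actual definitions may differ in normalization, and the example object is hypothetical.

```python
def compactness(border_length: float, n_pixels: int) -> float:
    """Ratio of an object's border length to its total number of pixels;
    lower values indicate a more compact object."""
    return border_length / n_pixels

def smoothness(border_length: float, bbox_perimeter: float) -> float:
    """Ratio of an object's border length to its bounding-box perimeter;
    values near 1 indicate a smooth, non-ragged boundary."""
    return border_length / bbox_perimeter

# Hypothetical 4x4 square object: 16 pixels, border length 16 pixel edges,
# bounding-box perimeter 16 -- its border exactly follows the bounding box.
print(compactness(16, 16), smoothness(16, 16))
```

A ragged object with the same pixel count would have a longer border, raising both ratios, which is what the segmentation penalizes as spatial heterogeneity.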
The actual classification of the image is done in a different manner than in a pixel-based approach. The analyst is not constrained to using only spectral information, but can use the mean spectral information of the segment as well as various shape measures. This introduces flexibility and robustness (Jensen, 2005, chap. 9).
2.5.3 Hierarchical Approach
Often, no single classification method, or single set of data, is appropriate for the classification of all features in an image. The hierarchical approach allows the use of specific classification methods for different classes. In the hierarchical approach, the image is first classified into broad categories with similar signatures. These broad classes are then further separated into finer classes (Shackelford & Davis, 2003b). An approach where the same classifier is not used for all land-cover types is referred to as a hybrid approach (Lo & Choi, 2004).
An example of both the hybrid and the hierarchical approach has been presented by Ban. The method described is called the sequential-masking approach. In this method, the most distinct feature is classified first and then masked out before the next land-cover type is classified. Not all features are classified with the same image; instead, images from different dates are used for different features (Ban & Howarth, 1999).
Another example of this approach is presented by Shackelford and Davis. In their attempt to classify IKONOS images over urban areas, they use both techniques. First, the image is classified using an ordinary maximum likelihood classifier to separate four well-defined groups (Grass-Tree, Road-Buildings, Water-Shadow, Bare Soil). They then use methods specially designed for each group to further refine them. For example, they show that the entropy texture measure was good for separating grass from trees (Shackelford & Davis, 2003b).
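The hierarchical idea can be sketched as a two-stage pipeline. This is a schematic illustration only: the class labels, the single-feature thresholds standing in for real classifiers, and the sample feature values are all hypothetical, not the cited implementations.

```python
import numpy as np

def broad_classifier(pixels: np.ndarray) -> np.ndarray:
    """Stage 1 stand-in: split pixels into broad 'vegetation' vs 'built-up'
    classes on a single hypothetical feature threshold."""
    return np.where(pixels[:, 0] > 0.5, "vegetation", "built-up")

def refine_vegetation(pixels: np.ndarray) -> np.ndarray:
    """Stage 2 stand-in, applied only to the vegetation subset."""
    return np.where(pixels[:, 1] > 0.5, "forest", "grass")

features = np.array([[0.9, 0.8], [0.9, 0.2], [0.1, 0.7]])
labels = broad_classifier(features)

# Refine only the broad 'vegetation' class; 'built-up' pixels keep their
# stage-1 label here (in practice they would get their own specialised
# classifier, and already-decided classes stay masked out).
mask = labels == "vegetation"
labels[mask] = refine_vegetation(features[mask])
print(labels)
```

The masking step is what lets each stage-2 classifier concentrate on separating only the classes with similar signatures, as described above.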
Chapter 3
Study Area and Data Description
3.1 Study Area
The study area for this research is Stockholm, the political and economic centre of Sweden. The city is located in the eastern part of Sweden, at the boundary between Lake Mälaren and the Baltic Sea.
In 2006, the number of inhabitants in Stockholm was 783 000. At the same time, 1.9 million people were living in the region (stockholm.se). The city is growing at a rate of approximately 1% per year, giving an expected size of the region of between 2.2 and 2.4 million inhabitants by 2030.
The settlement structure of Stockholm can be described as including a clear inner structure, with a strong regional core, radial settlement bands, green wedges and a large archipelago. The central parts of Stockholm are actually built on some of those islands.
The green wedges, of which there are ten, give the city "green corridors" all the way into the central parts of the city. The archipelago consists of about 25 000 islands, of which many are populated, utilized and preserved (RUFS, 2002).
The Stockholm region contains a large diversity of land-cover types: res- idential areas, dense urban centres, roads, agricultural areas, parks, and recreational areas.
The average temperature of Stockholm is 16 °C during summer and −3 °C to −5 °C during winter. The annual amount of rain is between 450 and 650 mm. The fact that most of the rain falls during summer and fall makes it difficult to find good optical remotely sensed images of Stockholm.
In the last ten years, a dramatic change in climate has appeared; statistical data show that the temperature has increased by almost one degree compared to data from 1961 to 1990, and over the same period the amount of rain has increased by about 10% (SMHIa, 2005; SMHIb, 2005).
Figure 3.1: Study area. The left part of the image shows Sweden. The pink box marks the position on the map of the zoomed-in area to the right, where the SPOT image marks the actual study area. Vector data property of Lantmäteriet.
3.2 Data Description
3.2.1 PALSAR
The Advanced Land Observing Satellite (ALOS) was launched in January 2006 by the Japan Aerospace Exploration Agency (JAXA). The satellite is placed in a sun-synchronous orbit at 691 km, with a temporal pass structure of 17 or 29 days. Onboard, it carries three instruments: one panchromatic sensor for stereo mapping (PRISM), one multispectral sensor (AVNIR-2) and finally the Phased Array L-band Synthetic Aperture Radar (PALSAR).
PALSAR is an enhanced version of the SAR carried on JERS-1. The sensor is a fully polarimetric instrument operating in the L-band, with a centre frequency of 1270 MHz (a wavelength of 23.6 cm). There are five different modes for PALSAR: Fine Beam Single Polarization, Fine Beam Dual Polarization, Polarimetric mode, ScanSAR mode and Direct Transmission mode (Rosenqvist & Shimada, 2007).
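The quoted frequency and wavelength are consistent, since the wavelength follows from the centre frequency through the speed of light:

```latex
\lambda = \frac{c}{f} = \frac{3.00 \times 10^{8}\,\mathrm{m/s}}{1.270 \times 10^{9}\,\mathrm{Hz}} \approx 0.236\,\mathrm{m} = 23.6\,\mathrm{cm}
```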
In this research, two PALSAR images are used, both acquired in the Fine Beam Dual Polarization mode during the summer of 2007 (figure 3.2). The first image was taken June 18, and the second July 17. The ground resolution of the images is 12.5 × 12.5 meters and the off-nadir angle 34.3 degrees. Both images contain the HH and HV polarizations.

Figure 3.2: Two PALSAR images acquired in the Fine Beam Dual Polarization mode during summer of 2007. The left image was acquired June 18, and the right July 17. Both are used for the land-cover classification in this thesis.

Table 3.1: Configuration of the PALSAR sensor, where FBS means Fine Beam Single Polarization, FBD Fine Beam Dual Polarization, DSN Direct Downlink, PLR Polarimetry and WB1 ScanSAR. Out of the possible 132 modes, this table shows the standard configurations (ESA). For this thesis, an image acquired in FBD mode was used.

                   FBS          FBD          DSN          PLR        WB1
Pixel spacing      12.5 m       12.5 m       12.5 m       12.5 m     100 m
Off-nadir angle    34.3°        34.3°        34.3°        21.5°      20.1°-36.5°
Incidence angle    7.5°-60.0°   7.5°-60.0°   7.5°-60.0°   8°-30°     18.0°-43.3°
Swath width        70 km        70 km        70 km        30 km      350 km
Polarisation       HH           HH/HV        HH           HH/HV +    HH
                                                          VV/VH
3.2.2 SPOT
The first SPOT satellite was launched in February 1986 by the French government in partnership with Sweden and Belgium. The satellite began a new era in remote sensing by using a linear array sensor and employing pushbroom scanning techniques. Shortly before the first satellite retired in 1990, SPOT-2, with a similar configuration to SPOT-1, was launched. The same configuration was also used for SPOT-3, launched in 1993. SPOT-4 was an improved version of SPOT 1-3, and was launched in 1998.
On May 3, 2002, the SPOT program entered a new era with SPOT-5, carrying two High Resolution Geometric (HRG) sensors. These sensors can be used to acquire either panchromatic images, with a resolution of 2.5 or 5 meters, or multispectral images. The resolutions of the multispectral images are 10 meters in green, red and near infrared, and 20 meters in the mid-infrared. The HRG sensors are also pointable in ±31°, to shorten the revisit time (Lillesand et al., 2004, chap. 6).
For this research, a multispectral HRG scene acquired by SPOT-5 on July 27, 2007 was used (figure 3.3).
3.2.3 Vector Data
As reference data for the geometric correction of the images, vector data from the Swedish Land Survey was used. This data, called Terrängkartan, has a resolution of 5 × 5 meters and is projected in the Swedish system RT90 2.5 gon W (Swedish Land Survey). The dataset includes all features normally used in a map, but for the geometric correction, only the road data was used. For orientation purposes, the elevation contours were included in the process.
3.2.4 DEM
An elevation model with a resolution of 50 × 50 meters from the Swedish Land Survey was used for the ortho-rectification of the SPOT and PALSAR data. This DEM is also projected in RT90 2.5 gon W.
Figure 3.3: A multispectral HRG scene acquired by SPOT-5 July 27, 2007. Here shown in false colour: R: NIR, G:Red, B:Green.
Table 3.2: Configuration of the SPOT-5 HRG sensor (Lillesand et al., 2004).

Spectral bands      Panchromatic (5 / 2.5 m)
and resolution      Multispectral (10 m)
                    Short-wave infrared (20 m)
Spectral range      P:  0.48 - 0.71 µm
                    B1: 0.50 - 0.59 µm
                    B2: 0.61 - 0.68 µm
                    B3: 0.78 - 0.89 µm
                    B4: 1.58 - 1.75 µm
Swath               60 km to 80 km
Incidence angle     ±31.06°
Revisit interval    1 to 4 days
Chapter 4
Methodology
The methodology for this study is briefly described by the flowchart in figure 4.1. As some of the study area is covered by clouds in the SPOT image, two different classifications are done: one using both SPOT and PALSAR (left flow in the flowchart) and one using only PALSAR (right flow in the flowchart).
The classification using only PALSAR is used only for the areas covered by clouds in the SPOT image; for all other areas, the joint classification of SPOT and PALSAR is used.
For both classifications, the hierarchical approach proposed by Shackelford and Davis is used: the data is separated into sets using a broad classifier, here an artificial neural network (ANN) classifier. The sets are then refined using different methods for the different subsets.
The first step in the flow was to geometrically correct and orthorectify both images. The SPOT image was corrected using a normal rational model with GCPs, while the PALSAR images were first corrected using orbital models. When this was done, backscatter profiles were extracted from all PALSAR layers. The SAR data also went through a process of speckle and texture filtering.
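Speckle filters such as the Lee filter adapt the amount of smoothing to the local statistics of the image: homogeneous areas are averaged strongly, while edges and point targets are preserved. A minimal NumPy sketch of the idea, with an illustrative window size and noise variance rather than the exact settings used in this work, could look like:

```python
# Minimal adaptive (Lee-style) speckle filter for a single SAR band.
# Each output pixel blends the local window mean with the original
# value; the weight k grows where local variance exceeds the assumed
# noise variance (i.e. near edges), preserving structure.
import numpy as np

def lee_filter(img, size=3, noise_var=0.25):
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = padded[i:i + size, j:j + size]
            mean, var = win.mean(), win.var()
            k = max(0.0, var - noise_var) / var if var > 0 else 0.0
            out[i, j] = mean + k * (img[i, j] - mean)
    return out
```

On a perfectly homogeneous region the local variance is zero, so k = 0 and the filter returns the pure local mean, suppressing speckle completely; on strong edges k approaches 1 and the pixel is left almost untouched.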
The ANN classifier was used to separate the images into four broad and easily distinguished classes: Urban, Water, Open Area and Forest. This was done once using only PALSAR and once using a combination of PALSAR and SPOT. The four classes were then refined to their final classes, Water, Forest, Low Density Built-up (LD), High Density Built-up (HD), Road, Recreational Area and Open Field, by different methods for every class, using the object-based program eCognition.
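The principle behind such a supervised neural classifier can be illustrated with a single trained neuron separating two broad classes from synthetic two-band feature values. The real ANN used here is multi-layer and multi-class; everything below, including the band values, is a toy example:

```python
# A single-neuron perceptron: the simplest supervised neural classifier.
# It learns a weight per input band plus a bias by repeatedly nudging
# the weights in the direction that corrects misclassified samples.

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """samples: list of feature vectors; labels: 0 or 1 per sample."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    """Classify a feature vector with the trained weights."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
```

A multi-layer network extends this by stacking such neurons with non-linear activations, which is what allows it to separate classes that are not linearly separable in the input bands.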
To verify whether the hierarchical approach managed to improve the land-cover separation for the seven classes, reference classifications were created using an ANN classifier. The accuracy of the two resulting land-cover maps was then compared with the accuracy of the reference maps.
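The overall accuracy used in such comparisons is simply the proportion of correctly classified reference pixels, i.e. the diagonal sum of a confusion matrix divided by its total. A sketch with made-up numbers:

```python
# Overall accuracy from a confusion matrix.
# confusion[i][j] counts pixels of reference class i mapped to class j,
# so the diagonal holds the correctly classified pixels.

def overall_accuracy(confusion):
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total
```

For example, a two-class matrix with 8 and 9 correct pixels on the diagonal out of 20 reference pixels gives an overall accuracy of 0.85.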
Finally, the two classifications were merged; the cloud-covered "black holes" in the joint classification were replaced by the PALSAR classification.
Figure 4.1: Flowchart of the methodology as described in chapter 4.
4.1 Geometric Correction
When combining different datasets in one project, the images need to be corrected for geometric errors. Satellite images contain two kinds of geometric errors: systematic and non-systematic. The systematic errors can be caused by scan skew, panoramic distortion, platform velocity nonlinearities, perspective geometry and earth rotation. These errors can be corrected for using data from the platform. The non-systematic errors are mainly caused by variations through time in the position and attitude angles of the satellite platform. These errors can only be corrected for using Ground Control Points (GCPs) (Sertel et al., 2007). GCPs are points with known coordinates both in the image and in a terrestrial coordinate system. These points are used in a mathematical model, usually a polynomial, affine or rational model, to correct the image by a least-squares adjustment (Sertel et al., 2007).
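As a sketch of how such a least-squares adjustment works, a first-order (affine) model can be fitted to matched GCP pairs with NumPy. Real packages fit higher-order polynomial or rational models in essentially the same way, just with more terms in the design matrix; the GCP coordinates below are invented for illustration:

```python
# Least-squares estimation of an affine correction from GCPs:
# map_x = a0 + a1*x + a2*y and map_y = b0 + b1*x + b2*y, solved
# jointly from matched (image, map) coordinate pairs.
import numpy as np

def fit_affine(img_xy, map_xy):
    """img_xy, map_xy: lists of matched (x, y) GCP coordinates."""
    img_xy = np.asarray(img_xy, dtype=float)
    map_xy = np.asarray(map_xy, dtype=float)
    design = np.column_stack([np.ones(len(img_xy)), img_xy])  # [1, x, y]
    coeffs, *_ = np.linalg.lstsq(design, map_xy, rcond=None)
    return coeffs  # shape (3, 2): one column per output coordinate

def apply_affine(coeffs, x, y):
    """Transform one image coordinate into the map system."""
    return tuple(np.array([1.0, x, y]) @ coeffs)
```

With more GCPs than coefficients the system is overdetermined, and the least-squares solution distributes the residual GCP errors over the image, which is exactly the trade-off discussed below when choosing the number of coefficients.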
In this research, the PALSAR images were first imported using the SAR orthorectification tool in Geomatica's OrthoEngine. This tool uses orbit parameters from the satellite platform to correct the image. After this initial processing, both the PALSAR images and the SPOT image were further corrected using the rational model. The rational model is a simple mathematical model that builds a correlation between the pixels and their ground locations, using two polynomial functions for row and two for column. This model is more accurate than the polynomial model, as it includes elevation in the correction.
When geometrically correcting an image, the number of coefficients needs to be decided. Adding more coefficients to the model will improve the fit close to the GCPs, but will introduce new significant errors away from the control points (PCI Geomatica).
4.2 Classification Scheme
To be able to successfully extract land-cover/land-use maps from remotely sensed data, classes must be carefully selected. The classes must be mutually exclusive (there is no overlap between classes), exhaustive (all land-cover/land-use types present in the image must be included), and hierarchical. This requires the use of classification schemes, with correct definitions of the classes and a logical structure (Jensen, 2005, chap. 9).
One of the most commonly used classification schemes was created by the U.S. Geological Survey. It is a multilevel classification scheme, including all types of classes. As some of the classes in this scheme need very high resolution images to be extracted, many organizations have tried to modify it to better suit the classification of remotely sensed images. One of the most used modifications is the one developed for the National Land Cover
Table 4.1: Classification scheme. Left, the classification scheme based on the U.S. Geological Survey; right, the classes developed for this thesis.