Image interpretation & analysis - Lecture Note - Lecture Material


interpretation and analysis

To make good use of remote sensing data, we must be able to extract meaningful information from the imagery. This brings us to the topic of discussion in this chapter - interpretation and analysis - the sixth element of the remote sensing process which we defined in Chapter 1. Interpretation and analysis of remote sensing imagery involves the identification and/or measurement of various targets in an image in order to extract useful information about them. Targets in remote sensing images may be any feature or object which can be observed in an image, and have the following characteristics:

  • Targets may be a point, line, or area feature. This means that they can have any form, from a bus in a parking lot or plane on a runway, to a bridge or roadway, to a large expanse of water or a field.
  • The target must be distinguishable; it must contrast with other features around it in the image.

Image displayed in a pictorial or photograph-type format

Much interpretation and identification of targets in remote sensing imagery is performed manually or visually, i.e. by a human interpreter. In many cases this is done using imagery displayed in a pictorial or photograph-type format, independent of what type of sensor was used to collect the data and how the data were collected. In this case we refer to the data as being in analog format.

Images represented on a computer

As we discussed in Chapter 1, remote sensing images can also be represented in a computer as arrays of pixels, with each pixel corresponding to a digital number, representing the brightness level of that pixel in the image. In this case, the data are in a digital format. Visual interpretation may also be performed by examining digital imagery displayed on a computer screen. Both analog and digital imagery can be displayed as black and white (also called monochrome) images, or as colour images (refer back to Chapter 1, Section 1.7) by combining different channels or bands representing different wavelengths.

When remote sensing data are available in digital format, digital processing and analysis may be performed using a computer. Digital processing may be used to enhance data as a prelude to visual interpretation. Digital processing and analysis may also be carried out to automatically identify targets and extract information completely without manual intervention by a human interpreter. However, rarely is digital processing and analysis carried out as a complete replacement for manual interpretation. Often, it is done to supplement and assist the human analyst.

Manual interpretation and analysis dates back to the early beginnings of remote sensing for air photo interpretation. Digital processing and analysis is more recent, arising with the advent of digital recording of remote sensing data and the development of computers. Both manual and digital techniques for interpretation of remote sensing data have their respective advantages and disadvantages. Generally, manual interpretation requires little, if any, specialized equipment, while digital analysis requires specialized, and often expensive, equipment. Manual interpretation is often limited to analyzing only a single channel of data or a single image at a time due to the difficulty in performing visual interpretation with multiple images. The computer environment is more amenable to handling complex images of several or many channels or from several dates. In this sense, digital analysis is useful for simultaneous analysis of many spectral bands and can process large data sets much faster than a human interpreter. Manual interpretation is a subjective process, meaning that the results will vary with different interpreters. Digital analysis is based on the manipulation of digital numbers in a computer and is thus more objective, generally giving more consistent results. However, determining the validity and accuracy of the results from digital processing can be difficult.

It is important to reiterate that visual and digital analyses of remote sensing imagery are not mutually exclusive. Both methods have their merits, and in most cases a mix of the two is employed when analyzing imagery. In fact, the ultimate decision on the utility and relevance of the information extracted at the end of the analysis process must still be made by humans.


Elements of Visual Interpretation

As we noted in the previous section, analysis of remote sensing imagery involves the identification of various targets in an image, and those targets may be environmental or artificial features which consist of points, lines, or areas. Targets may be defined in terms of the way they reflect or emit radiation. This radiation is measured and recorded by a sensor, and ultimately is depicted as an image product such as an air photo or a satellite image.

What makes interpretation of imagery more difficult than the everyday visual interpretation of our surroundings? For one, we lose our sense of depth when viewing a two-dimensional image, unless we can view it stereoscopically so as to simulate the third dimension of height. Indeed, interpretation benefits greatly in many applications when images are viewed in stereo, as visualization (and therefore, recognition) of targets is enhanced dramatically. Viewing objects from directly above also provides a very different perspective than what we are familiar with. Combining an unfamiliar perspective with a very different scale and lack of recognizable detail can make even the most familiar object unrecognizable in an image. Finally, we are used to seeing only the visible wavelengths, and the imaging of wavelengths outside of this window is more difficult for us to comprehend.

Recognizing targets is the key to interpretation and information extraction. Observing the differences between targets and their backgrounds involves comparing different targets based on any, or all, of the visual elements of tone, shape, size, pattern, texture, shadow, and association. Visual interpretation using these elements is often a part of our daily lives, whether we are conscious of it or not. Examining satellite images on the weather report and following high speed chases through views from a helicopter are both familiar examples of visual image interpretation. Identifying targets in remotely sensed images based on these visual elements allows us to interpret and analyze them further. The nature of each of these interpretation elements is described below, along with an image example of each.

image showing the variations in tone

Tone refers to the relative brightness or colour of objects in an image. Generally, tone is the fundamental element for distinguishing between different targets or features. Variations in tone also allow the elements of shape, texture, and pattern of objects to be distinguished.

image showing that shape can be a very distinctive clue for interpretation

Shape refers to the general form, structure, or outline of individual objects. Shape can be a very distinctive clue for interpretation. Straight edge shapes typically represent urban or agricultural (field) targets, while natural features, such as forest edges, are generally more irregular in shape, except where people have built roads or made clear cuts. Farm or crop land irrigated by rotating sprinkler systems would appear as circular shapes.

size of objects in an image is a function of scale

Size of objects in an image is a function of scale. It is important to assess the size of a target relative to other objects in a scene, as well as the absolute size, to aid in the interpretation of that target. A quick approximation of target size can direct interpretation to an appropriate result more quickly. For example, if an interpreter had to distinguish zones of land use, and had identified an area with a number of buildings in it, large buildings such as factories or warehouses would suggest commercial property, whereas small buildings would indicate residential use.

Pattern refers to the spatial arrangement of visibly discernible objects. Typically an orderly repetition of similar tones and textures will produce a distinctive and ultimately recognizable pattern. Orchards with evenly spaced trees, and urban streets with regularly spaced houses are good examples of pattern.

Texture refers to the arrangement and frequency of tonal variation in particular areas of an image. Rough textures would consist of a mottled tone where the grey levels change abruptly in a small area, whereas smooth textures would have very little tonal variation. Smooth textures are most often the result of uniform, even surfaces, such as fields, asphalt, or grasslands. A target with a rough surface and irregular structure, such as a forest canopy, results in a rough textured appearance. Texture is one of the most important elements for distinguishing features in radar imagery.

Shadow is also helpful in interpretation as it may provide an idea of the profile and relative height of a target or targets which may make identification easier. However, shadows can also reduce or eliminate interpretation in their area of influence, since targets within shadows are much less (or not at all) discernible from their surroundings. Shadow is also useful for enhancing or identifying topography and landforms, particularly in radar imagery.

Association takes into account the relationship between other recognizable objects or features in proximity to the target of interest. The identification of features that one would expect to associate with other features may provide information to facilitate identification. In the example given above, commercial properties may be associated with proximity to major transportation routes, whereas residential areas would be associated with schools, playgrounds, and sports fields. In our example, a lake is associated with boats, a marina, and adjacent recreational land.

Did you know?


"...What will they think of next ?!..."

Remote sensing (image interpretation) has been used for archeological investigations. Sometimes the 'impression' that a buried artifact, such as an ancient fort foundation, leaves on the surface, can be detected and identified. That surface impression is typically very subtle, so it helps to know the general area to be searched and the nature of the feature being sought. It is also useful if the surface has not been disturbed much by human activities.


Digital Image Processing

Image analysis system

In today's world of advanced technology where most remote sensing data are recorded in digital format, virtually all image interpretation and analysis involves some element of digital processing. Digital image processing may involve numerous procedures including formatting and correcting of the data, digital enhancement to facilitate better visual interpretation, or even automated classification of targets and features entirely by computer. In order to process remote sensing imagery digitally, the data must be recorded and available in a digital form suitable for storage on a computer tape or disk. Obviously, the other requirement for digital image processing is a computer system, sometimes referred to as an image analysis system, with the appropriate hardware and software to process the data. Several commercially available software systems have been developed specifically for remote sensing image processing and analysis.

For discussion purposes, most of the common image processing functions available in image analysis systems can be grouped into the following four categories:

  • Preprocessing
  • Image Enhancement
  • Image Transformation
  • Image Classification and Analysis

Preprocessing functions involve those operations that are normally required prior to the main data analysis and extraction of information, and are generally grouped as radiometric or geometric corrections. Radiometric corrections include correcting the data for sensor irregularities and unwanted sensor or atmospheric noise, and converting the data so they accurately represent the reflected or emitted radiation measured by the sensor. Geometric corrections include correcting for geometric distortions due to sensor-Earth geometry variations, and conversion of the data to real world coordinates (e.g. latitude and longitude) on the Earth's surface.

 image enhancement

The second group of image processing functions, grouped under the term image enhancement, serves solely to improve the appearance of the imagery to assist in visual interpretation and analysis. Examples of enhancement functions include contrast stretching to increase the tonal distinction between various features in a scene, and spatial filtering to enhance (or suppress) specific spatial patterns in an image.

Image transformations are operations similar in concept to those for image enhancement. However, unlike image enhancement operations which are normally applied only to a single channel of data at a time, image transformations usually involve combined processing of data from multiple spectral bands. Arithmetic operations (i.e. subtraction, addition, multiplication, division) are performed to combine and transform the original bands into "new" images which better display or highlight certain features in the scene. We will look at some of these operations including various methods of spectral or band ratioing, and a procedure called principal components analysis which is used to more efficiently represent the information in multichannel imagery.

data classification

Image classification and analysis operations are used to digitally identify and classify pixels in the data. Classification is usually performed on multi-channel data sets (A) and this process assigns each pixel in an image to a particular class or theme (B) based on statistical characteristics of the pixel brightness values. There are a variety of approaches taken to perform digital classification. We will briefly describe the two generic approaches which are used most often, namely supervised and unsupervised classification.

In the following sections we will describe each of these four categories of digital image processing functions in more detail.

Did you know?

"...our standard operating procedure is..."

... the remote sensing industry and those associated with it have attempted to standardize the way digital remote sensing data are formatted in order to make the exchange of data easier and to standardize the way data can be read into different image analysis systems. The Committee on Earth Observation Satellites (CEOS) has specified this format, which is widely used around the world for recording and exchanging data.



Pre-processing

Pre-processing operations, sometimes referred to as image restoration and rectification, are intended to correct for sensor- and platform-specific radiometric and geometric distortions of data. Radiometric corrections may be necessary due to variations in scene illumination and viewing geometry, atmospheric conditions, and sensor noise and response. Each of these will vary depending on the specific sensor and platform used to acquire the data and the conditions during data acquisition. Also, it may be desirable to convert and/or calibrate the data to known (absolute) radiation or reflectance units to facilitate comparison between data.

Mosaic multiple images from a single sensor

Variations in illumination and viewing geometry between images (for optical sensors) can be corrected by modeling the geometric relationship and distance between the area of the Earth's surface imaged, the sun, and the sensor. This is often required so as to be able to more readily compare images collected by different sensors at different dates or times, or to mosaic multiple images from a single sensor while maintaining uniform illumination conditions from scene to scene.

Observed brightness values

As we learned in Chapter 1, scattering of radiation occurs as it passes through and interacts with the atmosphere. This scattering may reduce, or attenuate, some of the energy illuminating the surface. In addition, the atmosphere will further attenuate the signal propagating from the target to the sensor. Various methods of atmospheric correction can be applied ranging from detailed modeling of the atmospheric conditions during data acquisition, to simple calculations based solely on the image data. An example of the latter method is to examine the observed brightness values (digital numbers), in an area of shadow or for a very dark object (such as a large clear lake - A) and determine the minimum value (B). The correction is applied by subtracting the minimum observed value, determined for each specific band, from all pixel values in each respective band. Since scattering is wavelength dependent (Chapter 1), the minimum values will vary from band to band. This method is based on the assumption that the reflectance from these features, if the atmosphere is clear, should be very small, if not zero. If we observe values much greater than zero, then they are considered to have resulted from atmospheric scattering.
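The simple image-based correction described above can be sketched in a few lines. This is a minimal illustration, not a full atmospheric correction: the function name `dark_object_subtract` and the sample values are hypothetical, and the sketch takes the minimum over the whole band rather than over a hand-picked shadow or lake area.

```python
def dark_object_subtract(band):
    """Subtract the band's minimum digital number from every pixel.

    `band` is a list of rows of non-negative integer digital numbers.
    Simplifying assumption: the darkest pixel anywhere in the band stands
    in for the shadow or clear-lake area an analyst would normally choose.
    """
    darkest = min(min(row) for row in band)
    return [[dn - darkest for dn in row] for row in band]


# Hypothetical 3x3 band in which the darkest target reads 12 instead of 0;
# the offset is attributed to atmospheric scattering.
blue_band = [
    [12, 40, 55],
    [13, 60, 80],
    [30, 90, 120],
]
corrected = dark_object_subtract(blue_band)
print(corrected[0])  # [0, 28, 43]
```

Because scattering is wavelength dependent, the function would be called once per band, each call finding its own minimum.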

striping or banding

Noise in an image may be due to irregularities or errors that occur in the sensor response and/or data recording and transmission. Common forms of noise include systematic striping or banding and dropped lines. Both of these effects should be corrected before further enhancement or classification is performed. Striping was common in early Landsat MSS data due to variations and drift in the response over time of the six MSS detectors. The "drift" was different for each of the six detectors, causing the same brightness to be represented differently by each detector. The overall appearance was thus a 'striped' effect. The corrective process made a relative correction among the six detectors to bring their apparent values in line with each other. Dropped lines occur when there are system errors which result in missing or defective data along a scan line. Dropped lines are normally 'corrected' by replacing the line with the pixel values in the line above or below, or with the average of the two.
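The dropped-line repair just described is easy to sketch. The function name `repair_dropped_line` and the sample scene are hypothetical, and the sketch assumes the position of the bad scan line is already known.

```python
def repair_dropped_line(image, bad_row):
    """Return a copy of `image` with scan line `bad_row` replaced.

    An interior line becomes the (integer) average of the lines above and
    below; a first or last line is copied from its only neighbour.
    """
    fixed = [list(row) for row in image]
    last = len(image) - 1
    if bad_row == 0:
        fixed[0] = list(image[1])
    elif bad_row == last:
        fixed[last] = list(image[last - 1])
    else:
        above, below = image[bad_row - 1], image[bad_row + 1]
        fixed[bad_row] = [(a + b) // 2 for a, b in zip(above, below)]
    return fixed


scene = [
    [10, 20, 30],
    [0, 0, 0],      # dropped scan line
    [30, 40, 50],
]
repaired = repair_dropped_line(scene, bad_row=1)
print(repaired[1])  # [20, 30, 40]
```

The replaced values are estimates only; no real measurement is recovered for the dropped line.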

Dropped lines

For many quantitative applications of remote sensing data, it is necessary to convert the digital numbers to measurements in units which represent the actual reflectance or emittance from the surface. This is done based on detailed knowledge of the sensor response and the way in which the analog signal (i.e. the reflected or emitted radiation) is converted to a digital number, called analog-to-digital (A-to-D) conversion. By solving this relationship in the reverse direction, the absolute radiance can be calculated for each pixel, so that comparisons can be accurately made over time and between different sensors.
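As a rough illustration of reversing the A-to-D relationship, the sketch below assumes the sensor response is linear between a minimum and a maximum radiance. That linearity, the function name `dn_to_radiance`, and the calibration values are all assumptions made for the example; real calibration constants come from the sensor documentation.

```python
def dn_to_radiance(dn, l_min, l_max, dn_max=255):
    """Convert a digital number to radiance under an assumed linear response.

    l_min and l_max are the radiances recorded as DN 0 and DN dn_max.
    """
    return l_min + (l_max - l_min) * dn / dn_max


# Hypothetical calibration: DN 0 -> 0.0 and DN 255 -> 25.5 radiance units.
for dn in (0, 100, 255):
    print(dn, dn_to_radiance(dn, l_min=0.0, l_max=25.5))
```

Two sensors with different calibration ranges would report different digital numbers for the same target, but similar radiances after this conversion, which is what makes comparisons over time and between sensors possible.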

In section 2.10 in Chapter 2, we learned that all remote sensing imagery is inherently subject to geometric distortions. These distortions may be due to several factors, including: the perspective of the sensor optics; the motion of the scanning system; the motion of the platform; the platform altitude, attitude, and velocity; the terrain relief; and, the curvature and rotation of the Earth. Geometric corrections are intended to compensate for these distortions so that the geometric representation of the imagery will be as close as possible to the real world. Many of these variations are systematic, or predictable in nature and can be accounted for by accurate modeling of the sensor and platform motion and the geometric relationship of the platform with the Earth. Other unsystematic, or random, errors cannot be modeled and corrected in this way. Therefore, geometric registration of the imagery to a known ground coordinate system must be performed.

Geometric registration process

The geometric registration process involves identifying the image coordinates (i.e. row, column) of several clearly discernible points, called ground control points (or GCPs), in the distorted image (A - A1 to A4), and matching them to their true positions in ground coordinates (e.g. latitude, longitude). The true ground coordinates are typically measured from a map (B - B1 to B4), either in paper or digital format. This is image-to-map registration. Once several well-distributed GCP pairs have been identified, the coordinate information is processed by the computer to determine the proper transformation equations to apply to the original (row and column) image coordinates to map them into their new ground coordinates. Geometric registration may also be performed by registering one (or more) images to another image, instead of to geographic coordinates. This is called image-to-image registration and is often done prior to performing various image transformation procedures, which will be discussed in section 4.6, or for multitemporal image comparison.
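One way to picture how the transformation equations are determined is a least-squares fit of a first-order (affine) polynomial to the GCP pairs. The choice of an affine model, the function names, and the map coordinates below are assumptions for illustration; image analysis systems may use higher-order polynomials or other models.

```python
def solve_3x3(m, v):
    """Solve the 3x3 linear system m x = v by Gaussian elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for i in range(3):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(i + 1, 3):
            factor = a[r][i] / a[i][i]
            for c in range(i, 4):
                a[r][c] -= factor * a[i][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        x[i] = (a[i][3] - sum(a[i][c] * x[c] for c in range(i + 1, 3))) / a[i][i]
    return x


def fit_affine(image_pts, ground_pts):
    """Fit ground = c0 + c1*col + c2*row for each of the two ground axes."""
    design = [[1.0, col, row] for row, col in image_pts]
    normal = [[sum(d[i] * d[j] for d in design) for j in range(3)]
              for i in range(3)]
    coeffs = []
    for axis in range(2):
        rhs = [sum(d[i] * g[axis] for d, g in zip(design, ground_pts))
               for i in range(3)]
        coeffs.append(solve_3x3(normal, rhs))
    return coeffs


def to_ground(coeffs, row, col):
    """Map an image (row, col) to ground coordinates with the fitted model."""
    return tuple(c[0] + c[1] * col + c[2] * row for c in coeffs)


# Hypothetical GCPs: image (row, col) and matching map (x, y) in metres.
gcp_image = [(0, 0), (0, 100), (100, 0), (100, 100)]
gcp_ground = [(500000, 4000000), (503000, 4000000),
              (500000, 3997000), (503000, 3997000)]
coeffs = fit_affine(gcp_image, gcp_ground)
centre = to_ground(coeffs, 50, 50)
print(centre)  # close to (501500.0, 3998500.0)
```

With more than three GCPs the fit is overdetermined, so the size of the residuals at each GCP gives a useful check on how well the points were picked.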

Nearest neighbour

In order to actually geometrically correct the original distorted image, a procedure called resampling is used to determine the digital values to place in the new pixel locations of the corrected output image. The resampling process calculates the new pixel values from the original digital pixel values in the uncorrected image. There are three common methods for resampling: nearest neighbour, bilinear interpolation, and cubic convolution. Nearest neighbour resampling uses the digital value from the pixel in the original image which is nearest to the new pixel location in the corrected image. This is the simplest method and does not alter the original values, but may result in some pixel values being duplicated while others are lost. This method also tends to result in a disjointed or blocky image appearance.
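Nearest neighbour resampling can be sketched as follows. The function name `resample_nearest` and the `inverse_map` argument (a function giving, for each output pixel, the fractional position in the original image) are hypothetical; in practice that mapping would come from the registration step.

```python
import math


def resample_nearest(src, out_rows, out_cols, inverse_map):
    """Fill an output grid by copying the nearest original pixel.

    inverse_map(r, c) returns the fractional (row, col) in `src` that
    corresponds to output pixel (r, c). Positions outside the original
    image are clamped to its edge.
    """
    rows, cols = len(src), len(src[0])
    out = []
    for r in range(out_rows):
        line = []
        for c in range(out_cols):
            sr, sc = inverse_map(r, c)
            nr = min(max(math.floor(sr + 0.5), 0), rows - 1)
            nc = min(max(math.floor(sc + 0.5), 0), cols - 1)
            line.append(src[nr][nc])
        out.append(line)
    return out


original = [
    [10, 20],
    [30, 40],
]
# Hypothetical mapping: a 3x3 output grid covering the same 2x2 area.
resampled = resample_nearest(original, 3, 3, lambda r, c: (r * 2 / 3, c * 2 / 3))
for line in resampled:
    print(line)
# [10, 20, 20]
# [30, 40, 40]
# [30, 40, 40]
```

Every output value is one of the original digital numbers, but some originals are repeated, which is the source of the blocky appearance.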

Bilinear interpolation

Bilinear interpolation resampling takes a weighted average of four pixels in the original image nearest to the new pixel location. The averaging process alters the original pixel values and creates entirely new digital values in the output image. This may be undesirable if further processing and analysis, such as classification based on spectral response, is to be done. If this is the case, resampling may best be done after the classification process. Cubic convolution resampling goes even further to calculate a distance weighted average of a block of sixteen pixels from the original image which surround the new output pixel location. As with bilinear interpolation, this method results in completely new pixel values. However, both methods produce images with a smoother, more continuous appearance and avoid the blocky appearance of the nearest neighbour method.
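The weighted average used in bilinear interpolation can be written out for a single output location. The function name `bilinear_sample` is hypothetical, and the sketch assumes the original image has at least two rows and two columns.

```python
import math


def bilinear_sample(src, row, col):
    """Distance-weighted average of the four pixels surrounding (row, col).

    `row` and `col` are fractional positions in the original image.
    Positions outside the image are clamped to its edge.
    """
    rows, cols = len(src), len(src[0])
    r0 = min(max(math.floor(row), 0), rows - 2)
    c0 = min(max(math.floor(col), 0), cols - 2)
    fr = min(max(row - r0, 0.0), 1.0)   # fractional distance down
    fc = min(max(col - c0, 0.0), 1.0)   # fractional distance across
    top = src[r0][c0] * (1 - fc) + src[r0][c0 + 1] * fc
    bottom = src[r0 + 1][c0] * (1 - fc) + src[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


original = [
    [10, 20],
    [30, 40],
]
print(bilinear_sample(original, 0.5, 0.5))    # 25.0
print(bilinear_sample(original, 0.25, 0.75))  # 22.5
```

Neither 25 nor 22.5 occurs in the original image, which is why interpolated values may be a poor input for classification based on spectral response.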

Cubic convolution



Image Enhancement

Enhancements are used to make visual interpretation and understanding of imagery easier. The advantage of digital imagery is that it allows us to manipulate the digital pixel values in an image. Although radiometric corrections for illumination, atmospheric influences, and sensor characteristics may be done prior to distribution of data to the user, the image may still not be optimized for visual interpretation. Remote sensing devices, particularly those operated from satellite platforms, must be designed to cope with levels of target/background energy which are typical of all conditions likely to be encountered in routine use. With large variations in spectral response from a diverse range of targets (e.g. forest, deserts, snowfields, water, etc.) no generic radiometric correction could provide the optimum brightness range and contrast for all targets. Thus, for each application and each image, a custom adjustment of the range and distribution of brightness values is usually necessary.

Image histogram

In raw imagery, the useful data often populate only a small portion of the available range of digital values (commonly 8 bits or 256 levels). Contrast enhancement involves changing the original values so that more of the available range is used, thereby increasing the contrast between targets and their backgrounds. The key to understanding contrast enhancements is to understand the concept of an image histogram. A histogram is a graphical representation of the brightness values that comprise an image. The brightness values (i.e. 0-255) are displayed along the x-axis of the graph. The frequency of occurrence of each of these values in the image is shown on the y-axis.
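Counting the values behind a histogram takes only a few lines. The function name `histogram` and the tiny sample band are hypothetical; the sketch assumes 8-bit data.

```python
def histogram(band, levels=256):
    """Count how often each digital number (0 .. levels-1) occurs in a band."""
    counts = [0] * levels
    for row in band:
        for dn in row:
            counts[dn] += 1
    return counts


band = [
    [84, 90, 90],
    [120, 90, 153],
]
counts = histogram(band)
occupied = [dn for dn, n in enumerate(counts) if n > 0]
print(occupied[0], occupied[-1])  # 84 153
print(counts[90])                 # 3
```

The list index plays the role of the x-axis and the stored count the y-axis; the first and last occupied indices show how much of the 0-255 range the data actually use.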

Linear contrast stretch

By manipulating the range of digital values in an image, graphically represented by its histogram, we can apply various enhancements to the data. There are many different techniques and methods of enhancing contrast and detail in an image; we will cover only a few common ones here. The simplest type of enhancement is a linear contrast stretch. This involves identifying lower and upper bounds from the histogram (usually the minimum and maximum brightness values in the image) and applying a transformation to stretch this range to fill the full range. In our example, the minimum value (occupied by actual data) in the histogram is 84 and the maximum value is 153. These 70 levels occupy less than one-third of the full 256 levels available. A linear stretch uniformly expands this small range to cover the full range of values from 0 to 255. This enhances the contrast in the image with light toned areas appearing lighter and dark areas appearing darker, making visual interpretation much easier. This graphic illustrates the increase in contrast in an image before (left) and after (right) a linear contrast stretch.

Contrast in an image before (left) and after (right) a linear contrast stretch 
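A linear contrast stretch can be sketched as below, using the bounds from the example (84 and 153). The function name `linear_stretch` is hypothetical; clipping values that fall outside the chosen bounds is an added convention so that bounds other than the image minimum and maximum can be used.

```python
def linear_stretch(band, low, high, out_max=255):
    """Map digital numbers in [low, high] linearly onto [0, out_max].

    Values below `low` or above `high` are clipped to 0 and out_max.
    """
    stretched = []
    for row in band:
        stretched.append([
            min(max(round((dn - low) * out_max / (high - low)), 0), out_max)
            for dn in row
        ])
    return stretched


band = [[84, 118, 153]]
print(linear_stretch(band, low=84, high=153))  # [[0, 126, 255]]
```

The 70 input levels are spread over all 256 output levels, so neighbouring input values end up three or four grey levels apart.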

histogram-equalized stretch

A uniform distribution of the input range of values across the full range may not always be an appropriate enhancement, particularly if the input range is not uniformly distributed. In this case, a histogram-equalized stretch may be better. This stretch assigns more display values (range) to the frequently occurring portions of the histogram. In this way, the detail in these areas will be better enhanced relative to those areas of the original histogram where values occur less frequently. In other cases, it may be desirable to enhance the contrast in only a specific portion of the histogram. For example, suppose we have an image of the mouth of a river, and the water portions of the image occupy the digital values from 40 to 76 out of the entire image histogram. If we wished to enhance the detail in the water, perhaps to see variations in sediment load, we could stretch only that small portion of the histogram represented by the water (40 to 76) to the full grey level range (0 to 255). All pixels below or above these values would be assigned to 0 and 255, respectively, and the detail in these areas would be lost. However, the detail in the water would be greatly enhanced.
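A histogram-equalized stretch can be sketched with the cumulative histogram. The mapping used here (cumulative count rescaled to 0-255) is one common formulation, chosen for illustration; the function name `equalize` and the sample band are hypothetical.

```python
def equalize(band, levels=256):
    """Histogram-equalize a band using its cumulative histogram."""
    counts = [0] * levels
    for row in band:
        for dn in row:
            counts[dn] += 1
    total = sum(counts)

    # Cumulative count of pixels at or below each digital number.
    cdf, running = [], 0
    for n in counts:
        running += n
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [list(row) for row in band]  # constant image: nothing to stretch

    lookup = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[lookup[dn] for dn in row] for row in band]


# Most pixels sit at 51 and 52; a single bright pixel sits at 200.
band = [
    [50, 51, 51, 51],
    [51, 52, 52, 200],
]
print(equalize(band))  # [[0, 146, 146, 146], [146, 219, 219, 255]]
```

The frequently occurring values 51 and 52, only one level apart in the input, end up 73 levels apart, while the large but sparsely populated gap between 52 and 200 is compressed to 36 levels.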

filtering procedure

Spatial filtering encompasses another set of digital processing functions which are used to enhance the appearance of an image. Spatial filters are designed to highlight or suppress specific features in an image based on their spatial frequency. Spatial frequency is related to the concept of image texture, which we discussed in section 4.2. It refers to the frequency of the variations in tone that appear in an image. "Rough" textured areas of an image, where the changes in tone are abrupt over a small area, have high spatial frequencies, while "smooth" areas with little variation in tone over several pixels, have low spatial frequencies. A common filtering procedure involves moving a 'window' of a few pixels in dimension (e.g. 3x3, 5x5, etc.) over each pixel in the image, applying a mathematical calculation using the pixel values under that window, and replacing the central pixel with the new value. The window is moved along in both the row and column dimensions one pixel at a time and the calculation is repeated until the entire image has been filtered and a "new" image has been generated. By varying the calculation performed and the weightings of the individual pixels in the filter window, filters can be designed to enhance or suppress different types of features.
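The moving-window procedure can be sketched with a weighted average as the calculation. The function name `weighted_window_filter` is hypothetical, the weights are assumed to have a non-zero sum, and leaving border pixels unchanged is just one simple way of handling the image edge.

```python
def weighted_window_filter(image, weights):
    """Replace each interior pixel by a weighted average of its window.

    `weights` is a square list of lists with an odd side length (3x3, 5x5,
    ...) and a non-zero sum. Border pixels without a full window are
    copied unchanged.
    """
    size = len(weights)
    half = size // 2
    weight_sum = sum(sum(row) for row in weights)
    rows, cols = len(image), len(image[0])
    out = [[float(v) for v in row] for row in image]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            total = 0
            for i in range(size):
                for j in range(size):
                    total += weights[i][j] * image[r + i - half][c + j - half]
            out[r][c] = total / weight_sum
    return out


spike = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
equal = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
centre_heavy = [[1, 1, 1], [1, 2, 1], [1, 1, 1]]
print(weighted_window_filter(spike, equal)[1][1])         # 20.0
print(weighted_window_filter(spike, centre_heavy)[1][1])  # 28.0
```

Changing only the weights changes how strongly the bright central pixel is suppressed, which is the sense in which the weightings determine what a filter enhances or removes.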

 Low-pass filter

A low-pass filter is designed to emphasize larger, homogeneous areas of similar tone and reduce the smaller detail in an image. Thus, low-pass filters generally serve to smooth the appearance of an image. Average and median filters, often used for radar imagery (and described in Chapter 3), are examples of low-pass filters. High-pass filters do the opposite and serve to sharpen the appearance of fine detail in an image. One implementation of a high-pass filter first applies a low-pass filter to an image and then subtracts the result from the original, leaving behind only the high spatial frequency information. Directional, or edge detection filters are designed to highlight linear features, such as roads or field boundaries. These filters can also be designed to enhance features which are oriented in specific directions. These filters are useful in applications such as geology, for the detection of linear geologic structures.
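The high-pass implementation described above (original minus low-pass) and a simple directional filter can be sketched together. All names are hypothetical, the low-pass step is assumed to be a 3x3 average, and the edge weights are a Prewitt-style choice that responds to left-to-right changes in tone; border pixels are simply set to zero.

```python
def convolve3(image, kernel):
    """Weighted 3x3 sum at every interior pixel; border pixels are set to 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = float(sum(
                kernel[i][j] * image[r + i - 1][c + j - 1]
                for i in range(3) for j in range(3)
            ))
    return out


ONES = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
LEFT_RIGHT_EDGE = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]


def high_pass(image):
    """Original minus its 3x3 average, computed for interior pixels."""
    sums = convolve3(image, ONES)
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = image[r][c] - sums[r][c] / 9
    return out


# A dark area (10) meeting a bright area (50) along a vertical boundary.
scene = [
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
    [10, 10, 10, 50, 50],
]
print(convolve3(scene, LEFT_RIGHT_EDGE)[1])  # [0.0, 0.0, 120.0, 120.0, 0.0]
print(high_pass(scene)[1])
```

Both outputs are zero inside the uniform dark area and non-zero only next to the boundary, which is how these filters pick out fine detail and linear features.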

 Directional, or edge detection filters

Did you know?

An image 'enhancement' is basically anything that makes it easier or better to visually interpret an image. In some cases, like 'low-pass filtering', the enhanced image can actually look worse than the original, but such an enhancement was likely performed to help the interpreter see low spatial frequency features among the usual high frequency clutter found in an image. Also, an enhancement is performed for a specific application. This enhancement may be inappropriate for another purpose, which would demand a different type of enhancement.


Image Transformations

Image transformations typically involve the manipulation of multiple bands of data, whether from a single multispectral image or from two or more images of the same area acquired at different times (i.e. multitemporal image data). Either way, image transformations generate "new" images from two or more sources which highlight particular features or properties of interest, better than the original input images.

Image subtraction

Basic image transformations apply simple arithmetic operations to the image data. Image subtraction is often used to identify changes that have occurred between images collected on different dates. Typically, two images which have been geometrically registered (see section 4.4), are used with the pixel (brightness) values in one image (1) being subtracted from the pixel values in the other (2). Scaling the resultant image (3) by adding a constant (127 in this case) to the output values will result in a suitable 'difference' image. In such an image, areas where there has been little or no change (A) between the original images, will have resultant brightness values around 127 (mid-grey tones), while those areas where significant change has occurred (B) will have values higher or lower than 127 - brighter or darker depending on the 'direction' of change in reflectance between the two images. This type of image transform can be useful for mapping changes in urban development around cities and for identifying areas where deforestation is occurring, as in this example.
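The subtraction and offset can be sketched as follows. The function name `difference_image` and the sample values are hypothetical, and clipping the result to 0-255 is an added assumption for 8-bit display, since very large changes would otherwise fall outside that range.

```python
def difference_image(date1, date2, offset=127):
    """Subtract image 1 from image 2 pixel by pixel, then add an offset.

    Both images must be registered to each other and have the same size.
    Results are clipped to 0..255 (an assumption for 8-bit display).
    """
    return [
        [min(max(b - a + offset, 0), 255) for a, b in zip(row1, row2)]
        for row1, row2 in zip(date1, date2)
    ]


before = [
    [100, 100],
    [60, 200],
]
after = [
    [100, 40],
    [60, 230],
]
print(difference_image(before, after))  # [[127, 67], [127, 157]]
```

Unchanged pixels come out at 127, while the pixel that darkened drops to 67 and the one that brightened rises to 157.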

Image division or spectral ratioing is one of the most common transforms applied to image data. Image ratioing serves to highlight subtle variations in the spectral responses of various surface covers. By ratioing the data from two different spectral bands, the resultant image enhances variations in the slopes of the spectral reflectance curves between the two different spectral ranges that may otherwise be masked by the pixel brightness variations in each of the bands. The following example illustrates the concept of spectral ratioing. Healthy vegetation reflects strongly in the near-infrared portion of the spectrum while absorbing strongly in the visible red. Other surface types, such as soil and water, show near equal reflectances in both the near-infrared and red portions. Thus, a ratio image of Landsat MSS Band 7 (Near-Infrared - 0.8 to 1.1 μm) divided by Band 5 (Red - 0.6 to 0.7 μm) would result in ratios much greater than 1.0 for vegetation, and ratios around 1.0 for soil and water. Thus the discrimination of vegetation from other surface cover types is significantly enhanced. Also, we may be better able to identify areas of unhealthy or stressed vegetation, which show low near-infrared reflectance, as the ratios would be lower than for healthy green vegetation.
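The ratio image is a pixel-by-pixel division of one band by the other. In this sketch the function name `band_ratio` and the digital numbers are hypothetical, and setting the ratio to 0.0 where the red value is zero is an arbitrary placeholder to avoid division by zero.

```python
def band_ratio(nir, red):
    """Divide the near-infrared band by the red band, pixel by pixel.

    Pixels whose red value is 0 are given a ratio of 0.0 (a placeholder).
    """
    return [
        [n / r if r != 0 else 0.0 for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]


# Hypothetical digital numbers for four surface types, in this order:
# healthy vegetation, stressed vegetation, soil, water.
nir_band = [[120, 75, 80, 10]]
red_band = [[30, 30, 80, 10]]
print(band_ratio(nir_band, red_band))  # [[4.0, 2.5, 1.0, 1.0]]
```

Soil and water have very different brightness in each band yet share a ratio near 1.0, while the two vegetation pixels stand out and are also separated from each other.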

Another benefit of spectral ratioing is that, because we are looking at relative values (i.e. ratios) instead of absolute brightness values, variations in scene illumination as a result of topographic effects are reduced. Thus, although the absolute reflectances for forest covered slopes may vary depending on their orientation relative to the sun's illumination, the ratio of their reflectances between the two bands should always be very similar.

Normalized Difference Vegetation Index (NDVI)

More complex ratios involving the sums of and differences between spectral bands for various sensors have been developed for monitoring vegetation conditions. One widely used image transform is the Normalized Difference Vegetation Index (NDVI), which has been used to monitor vegetation conditions on continental and global scales using the Advanced Very High Resolution Radiometer (AVHRR) sensor onboard the NOAA series of satellites (see Chapter 2, section 2.11).
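The NDVI is conventionally computed as (NIR - Red) / (NIR + Red), where NIR and Red are the near-infrared and red band values. The sketch below is a minimal illustration of that formula and of the illumination argument made above; the function name `ndvi` and the sample values are assumptions, and pixels with no signal in either band are simply left at 0.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference of near-infrared and red band values."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    total = nir + red
    out = np.zeros_like(total)
    # Leave pixels with no signal in either band at 0 instead of dividing by zero.
    np.divide(nir - red, total, out=out, where=total > 0)
    return out

# Hypothetical forest pixel on a sunlit slope, and the same cover on a
# shaded slope where both bands receive half the illumination.
sunlit = ndvi(np.array([80.0]), np.array([20.0]))
shaded = ndvi(np.array([40.0]), np.array([10.0]))
print(sunlit, shaded)
```

Both pixels give the same index value even though their absolute brightness differs by a factor of two, which is the topographic benefit of working with relative values.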

Principal components analysis

Different bands of multispectral data are often highly correlated and thus contain similar information. For example, Landsat MSS Bands 4 and 5 (green and red, respectively) typically have similar visual appearances since reflectances for the same surface cover types are almost equal. Image transformation techniques based on complex processing of the statistical characteristics of multi-band data sets can be used to reduce this data redundancy and correlation between bands. One such transform is called principal components analysis. The objective of this transformation is to reduce the dimensionality (i.e. the number of bands) in the data, and compress as much of the information in the original bands as possible into fewer bands. The "new" bands that result from this statistical procedure are called components. This process attempts to pack as much of the information (or variance) in the original data as possible into the smallest number of new components. As an example of the use of principal components analysis, a seven band Thematic Mapper (TM) data set may be transformed such that the first three principal components contain over 90 percent of the information in the original seven bands. Interpretation and analysis of these three bands of data, combining them either visually or digitally, is simpler and more efficient than trying to use all of the original seven bands. Principal components analysis, and other complex transforms, can be used either as an enhancement technique to improve visual interpretation or to reduce the number of bands to be used as input to digital classification procedures, discussed in the next section.
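To make the idea of compressing correlated bands concrete, here is a small Python sketch of principal components analysis via the eigenvectors of the band covariance matrix. It is one common way to compute the transform, not necessarily the procedure used by any particular image analysis package. The function name `principal_components` and the synthetic four-band image are assumptions made for the example.

```python
import numpy as np

def principal_components(image):
    """image: (rows, cols, bands). Returns the component images and the
    fraction of total variance carried by each component."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Covariance between bands (bands x bands).
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns ascending eigenvalues; put the largest variance first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = (centered @ eigvecs).reshape(rows, cols, bands)
    return components, eigvals / eigvals.sum()

# Synthetic, highly correlated "bands": one shared pattern plus a little noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 50))
image = np.stack(
    [base + 0.05 * rng.normal(size=base.shape) for _ in range(4)], axis=-1
)

pcs, explained = principal_components(image)
print(explained)
```

Because the four synthetic bands are nearly identical, almost all of the variance ends up in the first component, mirroring the Thematic Mapper example in which a few components carry most of the information.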


Image Classification and Analysis

Digital image classification

A human analyst attempting to classify features in an image uses the elements of visual interpretation (discussed in section 4.2) to identify homogeneous groups of pixels which represent various features or land cover classes of interest. Digital image classification uses the spectral information represented by the digital numbers in one or more spectral bands, and attempts to classify each individual pixel based on this spectral information. This type of classification is termed spectral pattern recognition. In either case, the objective is to assign all pixels in the image to particular classes or themes (e.g. water, coniferous forest, deciduous forest, corn, wheat, etc.). The resulting classified image is composed of a mosaic of pixels, each of which belongs to a particular theme, and is essentially a thematic "map" of the original image.

When talking about classes, we need to distinguish between information classes and spectral classes. Information classes are those categories of interest that the analyst is actually trying to identify in the imagery, such as different kinds of crops, different forest types or tree species, different geologic units or rock types, etc. Spectral classes are groups of pixels that are uniform (or near-similar) with respect to their brightness values in the different spectral channels of the data. The objective is to match the spectral classes in the data to the information classes of interest. Rarely is there a simple one-to-one match between these two types of classes. Rather, unique spectral classes may appear which do not necessarily correspond to any information class of particular use or interest to the analyst. Alternatively, a broad information class (e.g. forest) may contain a number of spectral sub-classes with unique spectral variations. Using the forest example, spectral sub-classes may be due to variations in age, species, and density, or perhaps as a result of shadowing or variations in scene illumination. It is the analyst's job to decide on the utility of the different spectral classes and their correspondence to useful information classes.

Supervised classification

Common classification procedures can be broken down into two broad subdivisions based on the method used: supervised classification and unsupervised classification. In a supervised classification, the analyst identifies in the imagery homogeneous representative samples of the different surface cover types (information classes) of interest. These samples are referred to as training areas. The selection of appropriate training areas is based on the analyst's familiarity with the geographical area and their knowledge of the actual surface cover types present in the image. Thus, the analyst is "supervising" the categorization of a set of specific classes. The numerical information in all spectral bands for the pixels comprising these areas is used to "train" the computer to recognize spectrally similar areas for each class. The computer uses a special program or algorithm (of which there are several variations) to determine the numerical "signatures" for each training class. Once the computer has determined the signatures for each class, each pixel in the image is compared to these signatures and labeled as the class it most closely "resembles" digitally. Thus, in a supervised classification we are first identifying the information classes which are then used to determine the spectral classes which represent them.
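The train-then-label workflow above can be sketched with one of the simplest of the several algorithm variations, a minimum-distance-to-means classifier. The text does not say which algorithm is used, so this choice is an assumption, as are the function names, the tiny two-band image, and the convention that a training mask value of 0 means "not a training pixel".

```python
import numpy as np

def train_signatures(image, training_mask):
    """Mean spectral vector ("signature") of each training class.
    training_mask: 0 = not a training pixel, k > 0 = training area of class k."""
    classes = np.unique(training_mask[training_mask > 0])
    means = np.array([image[training_mask == c].mean(axis=0) for c in classes])
    return classes, means

def classify_min_distance(image, classes, means):
    """Label every pixel with the class whose signature is spectrally closest."""
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    # Euclidean distance from each pixel to each class mean.
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)].reshape(image.shape[:2])

# Hypothetical 2x3 image with two bands and one training pixel per class.
image = np.array([[[10, 80], [12, 78], [60, 20]],
                  [[11, 82], [58, 22], [62, 18]]], dtype=np.float64)
training_mask = np.array([[1, 0, 2],
                          [0, 0, 0]])

classes, means = train_signatures(image, training_mask)
labels = classify_min_distance(image, classes, means)
print(labels)
```

Real training areas would contain many pixels per class, and other classifiers also take the spread of each class into account rather than only its mean.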

Unsupervised classification

Unsupervised classification in essence reverses the supervised classification process. Spectral classes are grouped first, based solely on the numerical information in the data, and are then matched by the analyst to information classes (if possible). Programs, called clustering algorithms, are used to determine the natural (statistical) groupings or structures in the data. Usually, the analyst specifies how many groups or clusters are to be looked for in the data. In addition to specifying the desired number of classes, the analyst may also specify parameters related to the separation distance among the clusters and the variation within each cluster. This iterative clustering process may produce some clusters that the analyst will want to subsequently combine, or clusters that should be broken down further - each of these requiring a further application of the clustering algorithm. Thus, unsupervised classification is not completely without human intervention. However, it does not start with a pre-determined set of classes as in a supervised classification.
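As an illustration of an iterative clustering algorithm, the sketch below implements a basic k-means procedure in Python. The text does not name a specific algorithm, so k-means is an assumption chosen for simplicity, as are the function name `kmeans`, the random initialisation, and the sample pixel values. Only the number of clusters is specified by the analyst here.

```python
import numpy as np

def kmeans(pixels, n_clusters, n_iter=20, seed=0):
    """Group pixels (n_pixels, n_bands) into n_clusters spectral classes."""
    rng = np.random.default_rng(seed)
    pixels = pixels.astype(np.float64)
    # Start from randomly chosen pixels as the initial cluster centres.
    centres = pixels[rng.choice(len(pixels), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Assign each pixel to its nearest centre.
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centre to the mean of its members.
        for k in range(n_clusters):
            members = pixels[labels == k]
            if len(members):  # keep the old centre if a cluster is empty
                centres[k] = members.mean(axis=0)
    return labels, centres

# Hypothetical two-band pixel values forming two spectral groups.
pixels = np.array([[10, 80], [12, 78], [11, 82],
                   [60, 20], [58, 22], [62, 18]])

labels, centres = kmeans(pixels, n_clusters=2)
print(labels)
```

The cluster numbers carry no meaning by themselves; it remains the analyst's job to match each spectral cluster to an information class, or to merge and split clusters and run the algorithm again.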

Did you know?

"...this image has such lovely texture, don't you think?..."

...texture was identified as one of the key elements of visual interpretation (section 4.2), particularly for radar image interpretation. Digital texture classifiers are also available and can be an alternative (or assistance) to spectral classifiers. They typically perform a "moving window" type of calculation, similar to those for spatial filtering, to estimate the "texture" based on the variability of the pixel values under the window. Various textural measures can be calculated to attempt to discriminate between and characterize the textural properties of different features.
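A "moving window" texture measure of the kind mentioned above might look like the following Python sketch, which uses the standard deviation of the pixel values under each window as the texture value. The choice of standard deviation, the 3x3 window, the reflected border padding, and the function name `local_std` are all assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_std(image, size=3):
    """Standard deviation of pixel values in a size x size moving window."""
    pad = size // 2
    # Reflect the borders so the output has the same shape as the input.
    padded = np.pad(image.astype(np.float64), pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    return windows.std(axis=(-2, -1))

# A perfectly smooth patch and a rough (checkerboard) patch.
smooth = np.full((5, 5), 100)
rough = (np.indices((5, 5)).sum(axis=0) % 2) * 255

smooth_texture = local_std(smooth)
rough_texture = local_std(rough)
print(smooth_texture.max(), rough_texture.min())
```

The smooth patch has zero variability everywhere, while the rough patch has high variability under every window, so the two could be separated on texture alone even if their average brightness were similar.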


Data Integration and Analysis

In the early days of analog remote sensing when the only remote sensing data source was aerial photography, the capability for integration of data from different sources was limited. Today, with most data available in digital format from a wide array of sensors, data integration is a common method used for interpretation and analysis. Data integration fundamentally involves the combining or merging of data from multiple sources in an effort to extract better and/or more information. This may include data that are multitemporal, multiresolution, multisensor, or multi-data type in nature.

Multitemporal data integration has already been alluded to in section 4.6 when we discussed image subtraction. Imagery collected at different times is integrated to identify areas of change. Multitemporal change detection can be achieved through simple methods such as these, or by other more complex approaches such as multiple classification comparisons or classifications using integrated multitemporal data sets.

Multiresolution data merging is useful for a variety of applications. The merging of data of a higher spatial resolution with data of lower resolution can significantly sharpen the spatial detail in an image and enhance the discrimination of features. SPOT data are well suited to this approach as the 10 metre panchromatic data can be easily merged with the 20 metre multispectral data. Additionally, the multispectral data serve to retain good spectral resolution while the panchromatic data provide the improved spatial resolution.
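The multiresolution merging described above can be approximated with a simple ratio-based scheme, in the spirit of what is often called a Brovey-type transform. The text does not say which merging method is used, so this is only an assumed illustration. The function name `ratio_pansharpen`, the nearest-neighbour upsampling, and the sample values are likewise hypothetical; the factor of 2 reflects the 20 metre to 10 metre case.

```python
import numpy as np

def ratio_pansharpen(ms, pan, factor=2, eps=1e-6):
    """ms: (rows, cols, bands) multispectral image.
    pan: (rows * factor, cols * factor) panchromatic image."""
    # Nearest-neighbour upsampling of the multispectral bands to the pan grid.
    ms_up = np.repeat(np.repeat(ms.astype(np.float64), factor, axis=0),
                      factor, axis=1)
    # Mean of the bands stands in for the brightness of the coarse image.
    intensity = ms_up.mean(axis=2)
    # Rescale every band so that the brightness follows the sharper pan image.
    gain = pan.astype(np.float64) / np.maximum(intensity, eps)
    return ms_up * gain[..., None]

# One hypothetical 20 m pixel with two bands, covered by four 10 m pan pixels.
ms = np.array([[[40, 60]]])
pan = np.array([[25, 50],
                [75, 100]])

sharpened = ratio_pansharpen(ms, pan)
print(sharpened[..., 0])
print(sharpened[..., 1])
```

Each output pixel keeps the band-to-band proportions of the multispectral data while its overall brightness follows the panchromatic data, which is the division of labour between the two sources described above.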

Multispectral optical data with radar imagery

Data from different sensors may also be merged, bringing in the concept of multisensor data fusion. An excellent example of this technique is the combination of multispectral optical data with radar imagery. These two diverse spectral representations of the surface can provide complementary information. The optical data provide detailed spectral information useful for discriminating between surface cover types, while the radar imagery highlights the structural detail in the image.

Three-dimensional perspective views

Applications of multisensor data integration generally require that the data be geometrically registered, either to each other or to a common geographic coordinate system or map base. This also allows other ancillary (supplementary) data sources to be integrated with the remote sensing data. For example, elevation data in digital form, called Digital Elevation or Digital Terrain Models (DEMs/DTMs), may be combined with remote sensing data for a variety of purposes. DEMs/DTMs may be useful in image classification, as effects due to terrain and slope variability can be corrected, potentially increasing the accuracy of the resultant classification. DEMs/DTMs are also useful for generating three-dimensional perspective views by draping remote sensing imagery over the elevation data, enhancing visualization of the area imaged.

Geographical Information System (GIS)

Combining data of different types and from different sources, such as we have described above, is the pinnacle of data integration and analysis. In a digital environment where all the data sources are geometrically registered to a common geographic base, the potential for information extraction is extremely wide. This is the concept for analysis within a digital Geographical Information System (GIS) database. Any data source which can be referenced spatially can be used in this type of environment. A DEM/DTM is just one example of this kind of data. Other examples could include digital maps of soil type, land cover classes, forest species, road networks, and many others, depending on the application. The results from a classification of a remote sensing data set in map format could also be used in a GIS as another data source to update existing map data. In essence, by analyzing diverse data sets together, it is possible to extract better and more accurate information in a synergistic manner than by using a single data source alone. A myriad of analyses are possible across many applications. In the next and final chapter, we will look at examples of various applications of remote sensing data, many involving the integration of data from different