Processing and Classification of Remotely Sensed Data - Completely Remote Sensing Tutorial -

Processing and Classification of Remotely Sensed Data; Pattern Recognition; Approaches to Data/Image Interpretation

This part of the Introduction, which has centered on principles and theory underlying the practice of remote sensing, closes with several guidelines on how data are processed into a variety of images that can be classified. This page is a preview of a more extended treatment in Section 1; there is some redundancy relative to preceding pages for the sake of logical presentation. The ability to extract information from the data and interpret it will depend not only on the capabilities of the sensors used but also on how those data are then handled to convert the raw values into image displays that improve the products for analysis and applications (and ultimately on the knowledge and skills of the human interpreter). The key to a favorable outcome lies in the methods of image processing. The methods rely on obtaining good approximations of the aforementioned spectral response curves and tying these into the spatial context of objects and classes making up the scene.

In the sets of spectral curves shown below (made on site using a portable field spectrometer), it is clear that the spectral response for common inorganic materials is distinct from that of the several vegetation types. The first (left or top) spectral signatures indicate a gradual rise in reflectance with increasing wavelengths for the indicated materials. Concrete, being light-colored and bright, has a notably higher average than dark asphalt. The other materials fall in between. The shingles are probably bluish in color, as suggested by a rise in reflectance from about 0.4 to 0.5 µm and a flat response in the remainder of the visible (0.4 - 0.7 µm) light region. The second curves (on the right or bottom) indicate most vegetation types are very similar in response between 0.3 - 0.5 µm; show moderate variations in the 0.5 - 0.6 µm interval; and rise abruptly to display maximum variability (hence optimum discrimination) in the 0.7 - 0.9 µm range, thereafter undergoing a gradual drop beyond about 1.1 µm.
Spectral Curve (A) Diagram: Non-vegetated Land Areas.

Spectral Curve (B): Vegetated Land Areas

As we have seen, most sensors on spacecraft do not measure the spectral range(s) they monitor fast enough to produce a continuous spectral curve. Instead, they divide the spectral curves into intervals or bands. Each band contains the wavelength-dependent input shown in the continuous spectral curve, which varies in intensity (the ordinate of the diagram plots); this input is combined into a single value - the average of the intensity values over the spectral range present in the interval.
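The band-averaging idea can be illustrated with a minimal Python sketch. It assumes a spectrum sampled as (wavelength, reflectance) pairs and a flat, unweighted mean within each band; the band limits and reflectance values are made up for illustration, and a real detector applies its own wavelength-dependent response rather than a simple average.

```python
def band_average(spectrum, band_limits):
    """Collapse a sampled spectrum into one mean value per band.

    spectrum: list of (wavelength_um, reflectance) samples.
    band_limits: list of (low_um, high_um) intervals (hypothetical bands).
    Assumes a flat, unweighted mean over the samples inside each band.
    """
    averages = []
    for low, high in band_limits:
        values = [r for w, r in spectrum if low <= w < high]
        # A band with no samples has no defined average.
        averages.append(sum(values) / len(values) if values else None)
    return averages


# Illustrative (made-up) vegetation-like spectrum: low in the visible,
# high in the near-infrared.
spectrum = [(0.45, 0.04), (0.55, 0.10), (0.65, 0.05),
            (0.75, 0.40), (0.85, 0.50), (0.95, 0.48)]
bands = [(0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 1.1)]
band_values = band_average(spectrum, bands)
```

Each entry of `band_values` stands for the single number a band would report in place of the continuous curve over that interval.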

The spectral measurements represented in each band depend on the interactions between the incident radiation and the atomic and molecular structures of the material (pure pixel) or materials (mixed pixel) present on the ground. These interactions lead to a reflected (for wavelengths involving the visible and near-infrared) signal, which changes somewhat as it returns through the atmosphere. The measurements also depend on the nature of the detector system's response in the sensor. Remote sensing experts can use spectral measurements to describe an object by its composition. This is accomplished either by reference to independently determined spectral signatures or, more frequently, by identifying the object and extracting its spectral response in each of the (broad to narrow) bands used (this is complicated somewhat by the degradation of the response by mixed pixels). In the second case, the signature is derived from the scene itself, provided one can recognize the class by some spatial attribute. The wavelength-dependent reflectances from limited sampling points (individual or small groups of pixels) become the standards to which other pixels are compared (if matched closely, it is assumed those additional pixels represent the same material or class); this approach is called the training site method.

In practice, objects and features on Earth's surface are described more as classes than as materials per se. Consider, for instance, the material concrete. It is used in roadways, parking lots, swimming pools, buildings, and other structural units, each of which might be treated as a separate class. We can subdivide vegetation in a variety of ways: trees, crops, grasslands, lake bloom algae, etc. Finer subdivisions are permissible, by classifying trees as deciduous or evergreen, or deciduous trees into oak, maple, hickory, poplar, etc.

Two additional properties help to distinguish these various classes, some of which have the same materials; namely, shape (geometric patterns) and use or context (sometimes including geographical locations). Thus, we may assign a feature composed of concrete to the classes 'streets' and 'parking lots,' depending on whether its shape is long and narrow or more square or rectangular. Two features that have nearly identical spectral signatures for vegetation could be assigned to the classes 'forest' and 'crops' depending on whether the area in the images has irregular or straight (often rectangular, the case for most farms) boundaries.
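As a rough illustration of how shape can separate classes made of the same material, the sketch below measures how elongated a feature is from the bounding box of its pixels. The function names, the axis-aligned bounding box, and the threshold of 4 are all assumptions chosen for illustration, not a method taken from this Tutorial.

```python
def elongation(pixel_coords):
    """Ratio of the long side to the short side of the feature's bounding box.

    pixel_coords: list of (row, col) positions belonging to one feature.
    Assumes the feature is roughly aligned with the image axes.
    """
    rows = [r for r, _ in pixel_coords]
    cols = [c for _, c in pixel_coords]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return max(height, width) / min(height, width)


def concrete_class(pixel_coords, threshold=4.0):
    """Label a concrete feature by shape; the threshold is hypothetical."""
    return "street" if elongation(pixel_coords) >= threshold else "parking lot"


# Made-up features: a 2 x 20 pixel strip and a 5 x 6 pixel block.
strip = [(r, c) for r in range(2) for c in range(20)]
block = [(r, c) for r in range(5) for c in range(6)]
labels = [concrete_class(strip), concrete_class(block)]
```

A diagonal street would defeat the axis-aligned box used here; a practical system would need a rotation-independent shape measure.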

The task, then, of any remote sensing system is to detect radiation signals, determine their spectral character, derive appropriate signatures, and interrelate the spatial positions of the classes they represent. This ultimately leads to some type of interpretable display product, be it an image, a map, or a numerical data set, that mirrors the reality of the surface (affected by some atmospheric property[ies]) in terms of the nature and distribution of the features present in the field of view.

The determination of these classes requires that either hard copy, i.e., images, or numerical data sets be available and capable of visual or automated analysis. This is the function of image processing techniques, a subject that will be treated in considerable detail in Section 1 in which a single scene - Morro Bay on the California coast - is treated by various commonly used methods of display, enhancement, classification, and interpretation. On this page we will simply describe several of the principal operations that can be performed to show and improve an image. It should be worth your while to get a quick overview of image processing by accessing this good review on the Internet, at the Canadian Soonet site.

The starting point in scene analysis is the observation that radiances (from the ground and intervening atmosphere) measured by the sensors (from hand-held digital cameras to distant orbiting satellites) vary in intensity. Thus reflected light at some wavelength, or span of wavelengths (spectral region), can range in its intensity from very low values (dark in an image) because few photons are received to very bright (light toned) because of the high reflectances representing many more photons. Each level of radiance can be assigned a quantitative value (commonly as a fraction of 1 or as a percentage of the total radiance that can be handled by the sensor's range). The values are restated as digital numbers (DNs) that consist of equal increments over a range (commonly from 0 to 255; from minimum to maximum measured radiance). As used to engender an image, a DN is assigned some level of "gray" (from all black to all white, and shades of gray in between). When the pixel array acquired by the sensor is processed to show each pixel in its proper relative position and then the DN for the pixel is given a gray tone, a standard black and white image results.
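The conversion from radiance to DN can be sketched in a few lines of Python. This is a minimal illustration assuming a simple linear scaling between a sensor's minimum and maximum measurable radiance; the function name and the radiance values are hypothetical, and real sensors apply calibration steps not shown here.

```python
def radiance_to_dn(radiance, min_radiance, max_radiance, levels=256):
    """Quantize a radiance into an integer digital number (DN).

    Radiances are clipped to [min_radiance, max_radiance] and mapped onto
    equal increments from 0 to levels - 1 (0 to 255 by default).
    """
    fraction = (radiance - min_radiance) / (max_radiance - min_radiance)
    fraction = min(max(fraction, 0.0), 1.0)  # clip to the sensor's range
    return int(fraction * (levels - 1) + 0.5)  # round to the nearest level


# One made-up row of radiances, in arbitrary units on a 0 to 10 scale.
row = [0.0, 2.5, 5.0, 10.0]
dns = [radiance_to_dn(r, 0.0, 10.0) for r in row]
```

Displaying each DN as a gray tone (0 as black, 255 as white) at the pixel's position gives the black and white image described above.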

The simplest manipulation of these DNs is to expand or compress the range of DNs present in the scene and assign the new range to the gray levels available for display. This is called contrast stretching. For example, if the range of the majority (say, 90%) of DN values is 40 to 110, this might be stretched out to 0 to 200, extending further into both the darker and the lighter tones in a black and white image. We show several examples here, using a Landsat subscene from the area around Harrisburg, PA that will be examined in detail in the Exam that closes Section 1.
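The 40 to 110 example can be worked through with a short Python sketch of a linear stretch. The function name is ours, and clipping DNs that fall outside the input range to the ends of the output range is an assumption about how the outlying values are handled.

```python
def linear_stretch(dn, in_low, in_high, out_low=0, out_high=255):
    """Linearly map a DN from [in_low, in_high] onto [out_low, out_high].

    DNs outside the input range are clipped to the ends of the output range.
    """
    if dn <= in_low:
        return out_low
    if dn >= in_high:
        return out_high
    scale = (out_high - out_low) / (in_high - in_low)
    return int(out_low + (dn - in_low) * scale + 0.5)


# Stretch the 40-110 range from the text out to 0-200.
original = [20, 40, 75, 110, 150]
stretched = [linear_stretch(dn, 40, 110, 0, 200) for dn in original]
```

The midpoint of the input range (75) lands at the midpoint of the output range (100), and the 70 input levels are spread over 200 output levels, which is what increases the visible contrast.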

See below for identification of each depiction.

The upper left panel is a "raw" (unstretched) rendition of Landsat MSS band 5. A linear stretch appears in the upper right and a non-linear (departure from a straight line plot of increasing DN values) in the lower left. A special stretch known as Piecewise Linear is shown in the lower right.
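A piecewise linear stretch can be sketched as a set of straight-line segments joined at breakpoints, each segment with its own slope. The breakpoints below are hypothetical and are not those used for the Harrisburg panels; they simply give the 40 to 110 interval a steeper slope than the tails.

```python
def piecewise_linear_stretch(dn, breakpoints):
    """Map a DN through straight-line segments joined at breakpoints.

    breakpoints: list of (input_dn, output_dn) pairs sorted by input_dn.
    DNs beyond the first or last breakpoint take that breakpoint's output.
    """
    if dn <= breakpoints[0][0]:
        return breakpoints[0][1]
    if dn >= breakpoints[-1][0]:
        return breakpoints[-1][1]
    for (x0, y0), (x1, y1) in zip(breakpoints, breakpoints[1:]):
        if x0 <= dn <= x1:
            return int(y0 + (dn - x0) * (y1 - y0) / (x1 - x0) + 0.5)


# Hypothetical breakpoints: compress the tails, expand the 40-110 interval.
breakpoints = [(0, 0), (40, 20), (110, 230), (255, 255)]
stretched = [piecewise_linear_stretch(dn, breakpoints)
             for dn in (0, 20, 40, 75, 110, 255)]
```

Here the middle segment has a slope of 3 output levels per input level, while the two outer segments have slopes well below 1.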

Another essential ingredient in most remote sensing images is color. While variations in black and white imagery can be very informative, and were the norm in the earlier aerial photographs (color was often too expensive), the number of different gray tones that the eye can separate is limited to about 20-30 steps (out of a maximum of ~250) on a contrast scale. On the other hand, the eye can distinguish 20,000 or more color tints, so small but often important variations within the target materials or classes can be discerned. Liberal use of color in the illustrations found throughout this Tutorial takes advantage of this capability, unlike most textbooks, in which color is restricted owing to costs. For a comprehensive review of how the human eye functions to perceive gray and color levels, consult Chapter 2 in Drury, S.A., Image Interpretation in Geology, 1987, Allen & Unwin.

Any three bands (each covering a spectral range or interval) from a multispectral set, either unstretched or stretched, can be combined using optical display devices, photographic methods, or computer-generated imagery to produce a color composite (simple color version in natural [same as reality] colors, or quasi-natural [choice of colors approximates reality but departs somewhat from actual tones], or false color). Here is the Harrisburg scene in conventional false color (vegetation will appear red because the band that displays vegetation in light [bright] tones is projected through a red filter as the color composite is generated):

False color Landsat subscene around Harrisburg, PA.
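In digital form, a color composite amounts to assigning one band to each of the red, green, and blue display channels. The sketch below assumes three small co-registered DN grids with made-up values; the band names are illustrative, with the near-infrared band (in which vegetation is bright) sent to the red channel to mimic the false color convention.

```python
def color_composite(red_channel, green_channel, blue_channel):
    """Combine three equally sized DN grids into a grid of (R, G, B) tuples."""
    return [
        [(r, g, b) for r, g, b in zip(red_row, green_row, blue_row)]
        for red_row, green_row, blue_row
        in zip(red_channel, green_channel, blue_channel)
    ]


# Made-up 1 x 2 pixel bands: the first pixel is vegetation, the second is not.
near_ir = [[200, 60]]
visible_red = [[30, 90]]
visible_green = [[50, 80]]

# False color: near-IR shown as red, visible red as green, visible green as blue.
false_color = color_composite(near_ir, visible_red, visible_green)
```

The vegetation pixel comes out as (200, 30, 50), a strong red, which matches the appearance of vegetation in the false color scene.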

New kinds of images can be produced by making special data sets using computer processing programs. For example, one can divide the DNs of one band by those of another at each corresponding pixel site. This produces a band ratio image. Shown here is Landsat MSS Band 7 divided by Band 4, giving a new set of numbers that cluster around 1; these numbers are then expanded by a stretching program and assigned gray levels. In this scene, growing vegetation is shown in light gray tones.

7/4 ratio image of the Harrisburg subscene.
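A band ratio followed by a stretch can be sketched as follows. The DN values are made up, the function names are ours, and substituting a fill value where the denominator is zero is an assumption; the stretch is a simple minimum-to-maximum linear scaling onto 0 to 255.

```python
def band_ratio(numerator, denominator, fill=0.0):
    """Divide two co-registered DN grids pixel by pixel.

    Pixels with a zero denominator receive the fill value (an assumption).
    """
    return [
        [n / d if d != 0 else fill for n, d in zip(n_row, d_row)]
        for n_row, d_row in zip(numerator, denominator)
    ]


def stretch_to_gray(grid, levels=256):
    """Linearly expand a grid of ratios onto gray levels 0 to levels - 1."""
    flat = [v for row in grid for v in row]
    low, high = min(flat), max(flat)
    if high == low:
        return [[0 for _ in row] for row in grid]
    return [[int((v - low) / (high - low) * (levels - 1) + 0.5) for v in row]
            for row in grid]


# Made-up 2 x 2 DNs; the upper-left pixel behaves like growing vegetation
# (bright in the near-IR band, dark in the visible band).
band7 = [[120, 40], [60, 30]]
band4 = [[30, 40], [60, 60]]
ratio = band_ratio(band7, band4)
gray = stretch_to_gray(ratio)
```

The vegetation-like pixel has the highest ratio and therefore the lightest gray tone after stretching.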

Data from the several bands that are set up from spectral data in the visible and Near-IR tend to be correlated to varying degrees for some classes. This correlation can be minimized by a reprocessing technique known as Principal Components Analysis. New PCA bands are produced, each containing some information not found in the others. This image shows the first 4 components of a PCA product for Harrisburg; the upper left (Component 1) contains much more information (it accounts for the largest share of the scene variance) than the last image at the lower right (Component 4).

Principal Component images 1 through 4 for the Harrisburg subscene.
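To show what decorrelation means, here is a minimal sketch of Principal Components Analysis restricted to two bands, where the eigenvalues of the 2 x 2 covariance matrix have a closed form. The restriction to two bands, the function name, and the DN values are all assumptions made to keep the example short; a real product such as the four-component set above needs a general eigenvalue solver.

```python
import math


def pca_two_bands(band_a, band_b):
    """Principal components of two bands given as flat lists of DNs.

    Returns ((eigenvalue_1, eigenvalue_2), pc1_values, pc2_values), where
    the eigenvalues are the variances carried by each component.
    """
    n = len(band_a)
    mean_a, mean_b = sum(band_a) / n, sum(band_b) / n
    da = [x - mean_a for x in band_a]
    db = [y - mean_b for y in band_b]
    var_a = sum(x * x for x in da) / (n - 1)
    var_b = sum(y * y for y in db) / (n - 1)
    cov = sum(x * y for x, y in zip(da, db)) / (n - 1)

    # Closed-form eigenvalues of the matrix [[var_a, cov], [cov, var_b]].
    mid = (var_a + var_b) / 2
    half = math.hypot((var_a - var_b) / 2, cov)
    eig1, eig2 = mid + half, mid - half

    # Unit eigenvector belonging to the larger eigenvalue.
    if cov != 0:
        vx, vy = cov, eig1 - var_a
    elif var_a >= var_b:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm

    # Project the mean-centered pixels onto the two orthogonal axes.
    pc1 = [x * vx + y * vy for x, y in zip(da, db)]
    pc2 = [-x * vy + y * vx for x, y in zip(da, db)]
    return (eig1, eig2), pc1, pc2


# Two made-up, strongly correlated bands.
eigenvalues, pc1, pc2 = pca_two_bands([10, 20, 30, 40], [12, 19, 33, 38])
```

With these inputs nearly all of the variance falls in the first component, and the two output components are uncorrelated with each other, which is the property the technique is used for.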

New color images can be made from sets of three band ratios or three Principal Components. The color patterns will be different from natural or false color versions. Interpretation can be conducted by visual means, using the viewer's experience, or with the aid of automated interpretation programs, such as the many available in a computer-based Pattern Recognition procedure (see below on this page), or by both together.

A chief use of remote sensing data is in classifying the myriad of features in a scene (usually presented as an image) into meaningful categories or classes. The image then becomes a thematic map (the theme is selectable, e.g., land use; geology; vegetation types; rainfall). In Section 1 of the Tutorial we explain how to interpret an aerial or space image to derive a thematic map. This is done by creating an unsupervised classification, in which features are separated solely on their spectral properties, or a supervised classification, in which we use some prior or acquired knowledge of the classes in a scene in setting up training sites to estimate and identify the spectral characteristics of each class. A supervised classification of the Harrisburg subscene shows the distribution of the named (identified) classes, as these were established by the investigator who knew their nature from field observations. In conducting the classification, representative pixels of each class were lumped into one or more training sites that were manipulated statistically to compare unknown class pixels to these site references:

Supervised classification of the Harrisburg subscene.
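One simple way to compare unknown pixels with training site statistics is a minimum-distance-to-means rule, sketched below. This is a stand-in chosen for brevity; we do not know which classifier produced the Harrisburg map, and the class names and two-band DN values are made up.

```python
import math


def class_means(training_sites):
    """Mean DN vector per class.

    training_sites: dict mapping class name to a list of pixel vectors,
    each vector holding one DN per band.
    """
    means = {}
    for name, pixels in training_sites.items():
        n_bands = len(pixels[0])
        means[name] = [sum(p[i] for p in pixels) / len(pixels)
                       for i in range(n_bands)]
    return means


def classify_pixel(pixel, means):
    """Assign the class whose mean vector is closest in spectral space."""
    return min(means, key=lambda name: math.dist(pixel, means[name]))


# Hypothetical two-band training pixels for three classes.
training = {
    "water": [[10, 5], [12, 7]],
    "vegetation": [[30, 90], [34, 100]],
    "urban": [[80, 70], [90, 76]],
}
means = class_means(training)
labels = [classify_pixel(p, means) for p in ([32, 95], [11, 8], [85, 72])]
```

This rule ignores how widely each class scatters around its mean; statistical classifiers that use the class variances as well can separate overlapping classes better.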

We mention another topic that is integral to effective interpretation and classification. This is often cited as reference or ancillary data but is more commonly known as ground truth. Under this heading are various categories: maps and databases, test sites, field and laboratory measurements, and most importantly actual onsite visits to the areas being studied by remote sensing. This last has two main facets: 1) to identify what is there in terms of classes or materials so as to set up training sites for classification, and 2) to revisit parts of a classified image area to verify the accuracy of identification in places not visited. We will go into ground truth in more detail in the first half of Section 13; for a quick insight switch now to page 13-1.
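The second use of site visits, checking a classified image against what is actually on the ground, can be summarized numerically. The sketch below is one common way to do so, assuming paired lists of classified and field-verified labels at the revisited locations; the labels are made up and the function names are ours.

```python
from collections import Counter


def confusion_counts(classified, ground_truth):
    """Count (true class, assigned class) pairs at the checked locations."""
    return Counter(zip(ground_truth, classified))


def overall_accuracy(classified, ground_truth):
    """Fraction of checked locations where the assigned class is correct."""
    matches = sum(1 for c, g in zip(classified, ground_truth) if c == g)
    return matches / len(ground_truth)


# Made-up results at four revisited locations.
classified = ["forest", "crops", "water", "forest"]
ground_truth = ["forest", "crops", "water", "crops"]
counts = confusion_counts(classified, ground_truth)
accuracy = overall_accuracy(classified, ground_truth)
```

The counts show not only how often the classification was wrong but which classes were confused with which; here one crops site was mislabeled as forest.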

Another important topic - Pattern Recognition (PR) - will be looked at briefly on this page. Pattern Recognition is closely related (allied) to Remote Sensing and warrants a few comments here. Strangely, a search on the Internet has failed to find any really adequate site that effectively provides an overview or any definitive explanatory illustrations. Two sites that offer some insight are 1) Wikipedia, which presents a brief overview, and 2) McGill University, which contains an impressive list of the scope of topics one should be familiar with to understand and practically use Pattern Recognition methodologies. Elsewhere on the Net this very general definition of PR was found:

Pattern Recognition: The term refers to techniques for classifying a set of objects into a number of distinct classes by considering similarities of objects belonging to the same class and the dissimilarities of objects belonging to different classes. As an application in computer science, pattern recognition involves the imposition of identity on input data, such as speech, images, or a stream of text, by the recognition and delineation of patterns it contains and their relationships. Stages in pattern recognition may depend on measurement of the object to identify distinguishing attributes, extraction of features for the defining attributes, and comparison with known patterns to determine a match or mismatch. Pattern recognition has extensive application in astronomy, medicine, robotics, and remote sensing by satellites.

Pattern Recognition (sometimes referred to alternatively as "Machine Learning" or "Data Mining") uses spectral, spatial, contextual, or acoustic inputs to extract specific information from visual or sonic data sets. You are probably most familiar with the Optical Character Recognition (OCR) technique that reads a pattern of straight lines of different thicknesses called a bar code:

Example of a bar code pattern.

An optical scanner reads the set of lines and searches a database for this exact pattern. A computer program compares patterns, locates this one, and ties it to a database record that contains information relevant to this specific pattern (in a grocery store, for example, this would be the current price of the product whose package carries the bar code).

One application of Pattern Recognition familiar to many readers of this Tutorial is described by the term "facial recognition". The technique depends on selecting key metrics of features on the human face that can be specified by size, shape, color, spacing, etc. This information is digitized and entered into a data bank. When a new face is encountered, its metrics are compared to the reference set of faces to determine the extent of match. A summary of face recognition is given at the HowStuffWorks Internet site. Aspects of the concept appear in these illustrations:

Face recognition.
Face recognition.
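The compare-to-reference step can be sketched as a nearest-match search over vectors of facial metrics. Everything in this example is hypothetical: the three metrics, their values, the names in the reference set, and the distance threshold that decides whether the best match is close enough to count.

```python
import math


def best_match(probe, reference_set, max_distance):
    """Find the reference face whose metrics are closest to the probe.

    probe: list of facial metrics for the new face.
    reference_set: dict mapping an identity to its stored metric list.
    Returns (identity, distance), or (None, distance) when even the best
    candidate is farther away than max_distance.
    """
    name, distance = min(
        ((n, math.dist(probe, metrics)) for n, metrics in reference_set.items()),
        key=lambda item: item[1],
    )
    return (name, distance) if distance <= max_distance else (None, distance)


# Made-up metrics, e.g. eye spacing, nose width, jaw width (in cm).
references = {
    "person_a": [6.2, 3.1, 4.8],
    "person_b": [5.5, 3.6, 5.2],
}
match, distance = best_match([6.1, 3.0, 4.9], references, max_distance=0.5)
stranger, _ = best_match([9.0, 9.0, 9.0], references, max_distance=0.5)
```

The threshold matters as much as the distance measure: without it, every new face would be assigned to someone in the data bank, however poor the fit.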

In the 21st century era of terrorism and crime, surveillance of cities, airports, and other modes of transportation has led to emplacement of TV cameras that scan streets and other open areas. While most of these simply record what is happening in real time, which must be interpreted by human monitors, some are equipped with facial recognition and other types of pattern recognition software that provide rapid analysis of what/who is in the scene.

Other examples of PR encountered in today's technological environment include 1) security control relying on identification of an individual by recognizing a finger or hand print, or by matching a scan of the eye to a database that includes only those previously scanned and added to the base; 2) voice recognition used to perform tasks in an automated telephone call routing system (alternative: push a telephone button number to reach a department or service); 3) sophisticated military techniques that allow targets to be sought out and recognized by an onboard processor on a missile or "smart bomb"; 4) handwriting analysis and cryptography; 5) a feature recognition program that facilitates identification of fossils in rocks by analyzing shape and size and comparing these parameters to a data bank containing a collection of fossil images of known geometric properties; 6) classifying features, objects, and distribution patterns in a photo or equivalent image, as discussed above.

In the Remote Sensing Tutorial, pattern recognition is an implicit component of the image processing techniques of Unsupervised and Supervised Classification of geographic and other spatial images acquired by remote sensing satellites, as discussed in Section 1.
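An unsupervised classification groups pixels by spectral similarity alone, with no training sites. The sketch below uses a basic k-means clustering loop as one example of such a procedure; the choice of k-means, the starting centers, and the two-band pixel values are assumptions for illustration.

```python
import math


def kmeans(pixels, centers, iterations=10):
    """Cluster pixel vectors around a fixed number of spectral centers.

    pixels: list of DN vectors (one DN per band).
    centers: list of starting center vectors; its length sets the number
    of clusters. Returns (labels, centers) after the given iterations.
    """
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assign every pixel to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in pixels
        ]
        # Move each center to the mean of the pixels assigned to it.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centers


# Made-up two-band pixels that fall into two obvious spectral groups.
pixels = [[10, 5], [12, 7], [30, 90], [34, 100], [11, 6]]
labels, centers = kmeans(pixels, centers=[[0, 0], [50, 50]])
```

The clusters come out numbered, not named; deciding that cluster 0 is, say, water and cluster 1 vegetation is still left to the interpreter.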

Pattern Recognition is a major application field for various aspects of Artificial Intelligence and Expert Systems, Neural Networks, and Information Processing/Signal Processing (all outside the scope of coverage in this Tutorial) as well as statistical programs for decision-making (e.g., Bayesian Decision Theory). It has a definite place in remote sensing, particularly because of its effectiveness in geospatial analysis; however, it is ignored (insofar as the term Pattern Recognition per se is concerned) in most textbooks on Remote Sensing. Establishing a mutual bond between RS and PR can facilitate some modes of Classification. Pattern Recognition also plays an important role in Geographic Information Systems (GIS) (the topic is reviewed in detail in Section 15).

GIS is a fast-growing, now widely used approach to analyzing spatial (in the geographic sense) data of many kinds for the purpose of managing systems (such as forest cutting, city planning, crop planting, etc.) that require frequent, ongoing decisions. Different types of systems can have, in common, spatial references, such as geographic coordinates. Each type can be displayed as a map or layer. The different layers can be overlaid using the coordinates as a common means of registration. This is now done almost entirely on computers using software (such as that marketed by the ESRI Co.) that facilitates analysis and decision making. Some hint of how this is done (described in detail in Section 15) is afforded by these two diagrams:

A series of map layers which are registered as input into GIS analysis.
Scheme for using multilayered data in determining soil erosion over time.
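The overlay of registered layers can be sketched with small grids in which the same row and column always refer to the same ground cell. The layer names, values, and the siting rule below are hypothetical and are not taken from the diagrams; the code does not represent any particular GIS package.

```python
def overlay(layers, rule):
    """Apply a decision rule to every cell of a stack of registered layers.

    layers: dict mapping a layer name to a 2D grid; all grids share the
    same shape and cell coordinates.
    rule: function that receives a dict of layer values for one cell.
    """
    names = list(layers)
    first = layers[names[0]]
    rows, cols = len(first), len(first[0])
    return [
        [rule({n: layers[n][r][c] for n in names}) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2 x 2 map layers registered to the same coordinates.
layers = {
    "slope_percent": [[2, 15], [4, 3]],
    "flood_zone": [[False, False], [True, False]],
    "land_use": [["pasture", "forest"], ["pasture", "pasture"]],
}


def suitable_site(cell):
    """Hypothetical rule: gentle slope, outside flood zone, on pasture."""
    return (cell["slope_percent"] < 5
            and not cell["flood_zone"]
            and cell["land_use"] == "pasture")


suitability = overlay(layers, suitable_site)
```

The result is itself a new layer, a map of cells that pass every criterion, which can be displayed or fed into further analysis.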

All these processing and classifying activities are done to lead to some sort of end results or "bottom lines". The purpose is to gain new information, derive applications, and make action decisions. For example, a Geographic Information System program will utilize a variety of data that may be gathered and processed simply to answer a question like: "Where is the best place in a region of interest to locate (site) a new electric power plant?" Both machines (usually computers) and humans are customarily involved in seeking the answer. Remote sensing data are commonly an integral part of, and input into, GIS.

It is almost self-evident that the primary interpreter(s) will be one person or a group of humans. These must have a suitable knowledge base and adequate experience in evaluating data, solving problems, and making decisions. Where remote sensing and pattern recognition are among the "tools" used in the process, the interpreter must also be familiar with the principles and procedures underlying these technologies and have some solid expertise in selecting the right data inputs, processing the data, and deriving understandable outputs in order to reach satisfactory interpretations and consequent decisions. But, with the computer age it has also become possible to have software and display programs that conduct some - perhaps the majority - of the interpretations. Yet these automated end results must ultimately be evaluated by qualified people. As the field of Artificial Intelligence develops and decision rules become more sophisticated, a greater proportion of the interpretation and evaluation can be carried out by the computer programs chosen to yield the desired information. But at some stage the human mind must interact directly.

Throughout the Tutorial, and especially in Section 1, we will encounter examples of appropriate and productive interpretation and decision making.

We will now move on to the second phase of the Introduction: the history of remote sensing with emphasis on satellite systems that employ radiance monitoring sensors.