GIS Analysis Functions


  1. Spatial Data Functions
  2. Attribute Data Functions
  3. Integrated Analysis of Spatial and Attribute Data
  4. Cartographic Modeling
  5. Connectivity Functions
  6. Output Functions

GIS analysis functions use spatial and non-spatial attribute data to answer questions about the real world. It is the spatial analysis functions that distinguish a GIS from other information systems.

When you use a GIS to address real-world problems, you will face the question of which analysis functions to use to solve them. Used wisely, these functions lead to high-quality information from the GIS, and individual analysis functions must be used in the context of a complete analysis strategy (Stan Aronoff, 1989).


1. Spatial Data Functions

Spatial data is information about the location and shape of, and relationships among, geographic features, usually stored as coordinates and topology. Spatial data functions are used to transform spatial data files, such as digitized maps, edit them, and assess their accuracy. They are mainly concerned with the spatial data.


Format Transformations

Format is the pattern into which data are systematically arranged for use on a computer. Format transformations are used to get data into an acceptable GIS format. Digital files must be transformed into the data format used by the GIS, for example from a raster to a vector data structure. Raster data often require no reformatting. Vector data often require topology to be built from coordinate data, as in arc/node translations. Transformation can be very costly and time-consuming when the coordinate data are poor.


Geometric Transformations

Geometric transformations are used to assign ground coordinates to a map or data layer within the GIS, or to adjust one data layer so that it can be correctly overlaid on another of the same area. The procedure used to accomplish this correction is termed registration.

Two approaches are used in registration: the adjustment of absolute position and the adjustment of relative position. Relative position refers to the location of features in relation to other features. Rubber sheeting (registration by relative position) is a procedure that adjusts the features of one coverage (the "slave") in a nonuniform manner so that they fit another coverage (the "master"). Links representing from- and to-locations define the adjustment, so the method needs easily identifiable, accurate, well-distributed control points. Absolute position is the location of features in relation to a geographic coordinate system, that is, to the ground. In registration by absolute position, each layer is registered individually, which has the advantage that errors are not propagated from one layer to another.
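The role of links in registration can be illustrated with a minimal sketch. It fits a single affine transformation to exactly three from/to links and applies it to another point. This is a uniform adjustment, not rubber sheeting itself, which adjusts features in a nonuniform, piecewise manner. The function names and the control-point coordinates are assumptions made for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(links):
    """links: three ((x_from, y_from), (x_to, y_to)) pairs."""
    m = [[x, y, 1.0] for (x, y), _ in links]
    a, b, c = solve3(m, [to[0] for _, to in links])
    d, e, f = solve3(m, [to[1] for _, to in links])
    return a, b, c, d, e, f


def apply_affine(coeffs, point):
    """Map a digitizer coordinate to a ground coordinate."""
    a, b, c, d, e, f = coeffs
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical links: digitizer units -> ground coordinates in meters.
links = [((0, 0), (500000, 4000000)),
         ((10, 0), (500100, 4000000)),
         ((0, 10), (500000, 4000100))]
coeffs = fit_affine(links)
ground = apply_affine(coeffs, (5, 5))
```

With more than three links, a least-squares fit would normally replace the exact solution, and the residuals at the control points would give a measure of registration accuracy.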


Projection Transformations

Map projection is a mathematical transformation that is used to represent a spherical surface on a flat map. The transformation assigns to each location on a spherical surface a unique location on a 2-dimensional map.

Map projections always cause some distortion of area, shape, distance, or direction. A GIS commonly supports several projections and has software to transform data from one projection to another. The map projection most commonly used for mapping at scales of 1:500,000 or larger in North America is the UTM (Universal Transverse Mercator) projection. For maps of continental extent, the Albers, Lambert's Azimuthal, and Polyconic projections are commonly used.
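As a small illustration of a projection as a mathematical transformation, the sketch below implements the forward equations of the spherical Mercator projection. This is not the UTM projection, which is a transverse, ellipsoidal variant. The Earth radius and the function names are assumptions. The scale factor shows how distance distortion grows with latitude.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical Earth


def mercator_forward(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude in degrees to planar x, y in meters."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def mercator_scale_factor(lat_deg):
    """Linear scale distortion at a given latitude (1.0 at the equator)."""
    return 1.0 / math.cos(math.radians(lat_deg))


x0, y0 = mercator_forward(0.0, 0.0)
scale_at_60 = mercator_scale_factor(60.0)  # distances are about doubled here
```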



Conflation

Conflation is the procedure of reconciling the positions of corresponding features in different data layers. The same feature often appears at slightly different positions in different layers; conflation functions reconcile these differences so that the corresponding features overlay precisely. This is important when data from several data layers are used in an analysis.



Edge Matching

Edge matching is a procedure to adjust the position of features extending across map sheet boundaries. This function ensures that all features that cross adjacent map sheets have the same edge locations. Links are used when matching features in adjacent coverages.
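A minimal sketch of the idea, assuming that the endpoints on one sheet stay fixed and that nearby endpoints on the adjacent sheet are moved onto them. The tolerance, the coordinates, and the function names are hypothetical.

```python
import math


def build_links(fixed_ends, moving_ends, tolerance):
    """Link each moving endpoint to the nearest fixed endpoint within tolerance."""
    links = []
    for p in moving_ends:
        nearest = min(fixed_ends, key=lambda q: math.dist(p, q))
        if math.dist(p, nearest) <= tolerance:
            links.append((p, nearest))  # (from-location, to-location)
    return links


def apply_links(line, links):
    """Move any vertex that has a link to its to-location."""
    moves = dict(links)
    return [moves.get(pt, pt) for pt in line]


# Endpoints of features that reach the shared sheet boundary at x = 100.
west_sheet_ends = [(100.0, 20.0), (100.0, 55.0)]
east_sheet_ends = [(100.0, 20.4), (100.0, 54.7), (100.0, 80.0)]

links = build_links(west_sheet_ends, east_sheet_ends, tolerance=0.5)
road = [(100.0, 20.4), (130.0, 25.0)]    # a line on the east sheet
matched_road = apply_links(road, links)  # its first vertex now meets the west sheet
```

The endpoint at (100.0, 80.0) has no counterpart within the tolerance, so it receives no link and would need manual review.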


Editing Functions

Editing functions are used to add, delete, and change the geographic position of features. Sliver or splinter polygons are thin polygons that occur along the borders of polygons following digitizing and the topological overlay of two or more coverages.

Address matching is a mechanism for relating two files using the address as the relate item. Geographic coordinates and attributes can be transferred from one file to the other. For example, a data file containing student addresses can be matched to a street coverage that contains addresses, creating a point coverage of where the students live.
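The student example can be sketched as a simple table join on a normalized address string. The names, addresses, coordinates, and the normalization rule are made up for illustration; real address matching usually also handles address ranges and spelling variants.

```python
def normalize_address(address):
    """Upper-case, drop periods, and collapse repeated spaces."""
    return " ".join(address.upper().replace(".", "").split())


def match_addresses(students, street_addresses):
    """Return (matched points, unmatched names) using the address as relate item."""
    lookup = {normalize_address(addr): xy for addr, xy in street_addresses.items()}
    matched, unmatched = [], []
    for name, addr in students:
        xy = lookup.get(normalize_address(addr))
        if xy is None:
            unmatched.append(name)
        else:
            matched.append((name, xy))
    return matched, unmatched


# Hypothetical street coverage: address -> (x, y)
streets = {"12 MAIN ST": (3.0, 4.0), "7 Oak Ave": (8.0, 1.5)}
students = [("Ana", "12 Main St."), ("Ben", "7 oak  ave"), ("Cy", "99 Elm Rd")]

points, not_found = match_addresses(students, streets)
```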


Line Coordinate Thinning

The thinning function reviews all the coordinate data in a file, then identifies and removes unnecessary coordinates. Depending on scale, the number of coordinate pairs can often be reduced significantly without a perceived loss of detail.

This function is used to reduce the quantity of coordinate data that must be stored by the GIS. Fewer coordinate points mean a smaller data file, and therefore less data to store and process.
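One widely used way to thin a line is the Douglas-Peucker algorithm, sketched below. The text above does not name a specific method, so this choice, the tolerance value, and the sample line are assumptions. A vertex is kept only if it lies farther than the tolerance from the straight line joining the retained end points.

```python
import math


def point_to_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def thin_line(points, tolerance):
    """Douglas-Peucker thinning: drop vertices within tolerance of the trend."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_to_segment_distance(points[i], first, last)
        if d > worst_distance:
            worst_index, worst_distance = i, d
    if worst_distance <= tolerance:
        return [first, last]
    left = thin_line(points[:worst_index + 1], tolerance)
    right = thin_line(points[worst_index:], tolerance)
    return left[:-1] + right  # avoid repeating the shared vertex


line = [(0, 0), (1, 0.05), (2, -0.04), (3, 0), (4, 2), (5, 2.03), (6, 2)]
thinned = thin_line(line, tolerance=0.1)
# thinned == [(0, 0), (3, 0), (4, 2), (6, 2)]
```

Here seven coordinate pairs are reduced to four while the overall shape of the line is preserved. A larger tolerance, appropriate for a smaller map scale, would remove more.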


2. Attribute Data Functions

Attribute data describe the map items. They are typically stored in tabular format and linked to the feature by a user-assigned identifier (e.g., the attributes of a well might include depth and gallons per minute).


Retrieval (Selective Search)

Retrieval operations on the spatial and attribute data involve the selective search, manipulation, and output of data without the need to modify the geographic location of features or to create new spatial entities. These operations work with the spatial elements as they were entered in the database.

Information from database tables can be accessed directly through the map, or new maps can be created using information in the tabular database. Both graphic and tabular data must be stored in formats the computer can recognize and retrieve.
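A selective search can be sketched as a filter over an attribute table. The well records below are invented for illustration; only the field ideas (depth, gallons per minute) come from the text. Note that the features themselves are not modified.

```python
# Hypothetical attribute table: one record per well, keyed by feature id.
wells = [
    {"id": 1, "depth_ft": 120, "gallons_per_minute": 15},
    {"id": 2, "depth_ft": 60, "gallons_per_minute": 40},
    {"id": 3, "depth_ft": 210, "gallons_per_minute": 8},
]


def select(records, predicate):
    """Return the records that satisfy the search condition."""
    return [r for r in records if predicate(r)]


deep_wells = select(wells, lambda w: w["depth_ft"] > 100)
deep_ids = [w["id"] for w in deep_wells]  # ids link back to the map features
```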



Classification

Classification is the procedure of identifying a set of features as belonging to a group. Some form of classification function is provided in every GIS. In a raster-based GIS, numerical values are often used to indicate classes. Classification is important because it defines patterns, and one of the important functions of a GIS is to assist in recognizing new patterns. Classification is done using single data layers, as well as with multiple data layers as part of an overlay operation.

Generalization, also called map dissolve, is the process of making a classification less detailed by combining classes. Generalization is often used to reduce the level of classification detail to make an underlying pattern more apparent.
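In a raster layer, generalization can be sketched as a reclassification that maps several detailed class codes onto one broader code. The class codes and their meanings below are hypothetical.

```python
# Hypothetical detailed land-cover codes -> broader classes.
# 1 = pine, 2 = oak            -> 10 = forest
# 3 = wheat, 4 = corn          -> 20 = agriculture
# 5 = residential              -> 30 = urban
GENERALIZE = {1: 10, 2: 10, 3: 20, 4: 20, 5: 30}


def reclassify(grid, mapping):
    """Return a new grid with every cell value replaced by its mapped class."""
    return [[mapping[value] for value in row] for row in grid]


land_cover = [
    [1, 2, 3],
    [2, 2, 4],
    [5, 3, 4],
]
generalized = reclassify(land_cover, GENERALIZE)
```

After reclassification the forest block in the upper left appears as one class, which makes the broader pattern easier to see.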



Verification

Verification is a procedure for checking the values of attributes for all records in a database against their correct values (Keith C. Clarke, 1997).


3. Integrated Analysis of Spatial and Attribute Data



Overlay

Overlay is a GIS operation in which layers with a common, registered map base are joined on the basis of their occupation of space (Keith C. Clarke, 1997).

The overlay function creates composite maps by combining diverse data sets. The overlay function can perform simple operations such as laying a road map over a map of local wetlands, or more sophisticated operations such as multiplying and adding map attributes of different value to determine averages and co-occurrences.

Raster and vector models differ significantly in the way overlay operations are implemented. Overlay operations are usually performed more efficiently in raster-based systems. In many GISs a hybrid approach is used that takes advantage of the capabilities of both data models. A vector-based system may implement some functions in the raster domain by performing a vector-to-raster conversion on the input data, doing the processing as a raster operation, and converting the raster result back to a vector file.

Region Wide Overlay: "Cookie Cutter Approach"

The region wide, or "cookie cutter," approach to overlay analysis allows natural features, such as forest stand boundaries or soil polygons, to become the spatial area(s) which will be analyzed on another map.

For example (see figures above), given two data sets, forest patches and slope, what is the area-weighted average slope within each separate patch of forest? To answer this question, the GIS overlays each patch of forest from the forest patch data set onto the slope map and then calculates the area-weighted average slope for each individual forest patch.
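The forest and slope question can be sketched on two small registered rasters. With equal-sized cells, the area-weighted average reduces to a plain mean of the slope cells that fall inside each patch. The grids, the patch ids, and the use of 0 for "no forest" are assumptions.

```python
from collections import defaultdict


def zonal_mean(zones, values, nodata=0):
    """Mean of `values` within each zone id; cells equal to nodata are skipped."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            if zone == nodata:
                continue
            totals[zone] += value
            counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in totals}


# Forest patch ids (0 = not forest) and percent slope, cell for cell.
forest = [
    [1, 1, 0],
    [1, 2, 2],
    [0, 2, 2],
]
slope = [
    [4, 6, 9],
    [8, 10, 12],
    [3, 14, 16],
]
mean_slope = zonal_mean(forest, slope)
# mean_slope == {1: 6.0, 2: 13.0}
```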

Topological Overlay:

Co-occurrence mapping in a vector GIS is accomplished by topological overlay. Any number of maps may be overlaid to show features occurring at the same location. To accomplish this, the GIS first stacks the maps on top of one another and finds all new intersecting lines. Second, new nodes (point features where three or more arcs, or lines, come together) are set at these new intersections. Lastly, the topological structure of the data is rebuilt and the multifactor attributes are attached to the new area features.


Neighborhood Function

A neighborhood function analyzes the relationship between an object and similar surrounding objects. For example, it can be used to determine which kinds of land use lie next to a given kind of land use in a certain area. This type of analysis is often used in image processing. A new map is created by computing the value assigned to each location as a function of the values surrounding that location. Neighborhood functions are particularly valuable in evaluating the character of a local area.
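A minimal sketch of a neighborhood function on a raster: each output cell is the mean of the input cells in a 3 x 3 window around it. The window size, the choice of the mean, and the handling of edges (using only the cells that exist) are assumptions; other functions such as majority or maximum follow the same pattern.

```python
def focal_mean(grid):
    """3x3 moving-window mean; windows are clipped at the grid edges."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        result.append(out_row)
    return result


values = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
smoothed = focal_mean(values)
# smoothed[1][1] == 5.0 (all nine cells); smoothed[0][0] == 3.0 (four cells)
```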


Point-in-Polygon and Line-In-Polygon

Point-in-polygon is a topological overlay procedure which determines the spatial coincidence of points and polygons. Points are assigned the attributes of the polygons within which they fall. For example, this function can be used to analyze an address and find out whether it (a point) is located within a certain zip code area (a polygon).

Line-in-polygon is a spatial operation in which lines in one coverage are overlaid with polygons of another coverage to determine which lines, or portions of lines, are contained within the polygons. Polygon attributes are associated with corresponding lines in the resulting line coverage. For example, this function can be used to find out who will be affected when a new power line is put in an area.

In a vector-based GIS, the identification of points and lines contained within a polygon area is a specialized search function. In a raster-based GIS, it is essentially an overlay operation, with the polygons in one data layer and the points and/or lines in a second data layer.
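For the vector case, a point-in-polygon search can be sketched with the ray-casting test: a ray is drawn from the point to the right, and the point is inside if the ray crosses the polygon boundary an odd number of times. The zone names and coordinates are hypothetical, and points lying exactly on a boundary are not treated specially.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_zone(point, zones):
    """Give the point the attribute (name) of the polygon it falls in."""
    for name, polygon in zones.items():
        if point_in_polygon(point, polygon):
            return name
    return None


# Two hypothetical zip code areas sharing an edge at x = 4.
zones = {
    "zone A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "zone B": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
first = assign_zone((1, 1), zones)    # "zone A"
second = assign_zone((5, 2), zones)   # "zone B"
outside = assign_zone((9, 9), zones)  # None
```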


Topographic Functions

Topography refers to surface characteristics whose value changes continuously over an area, such as elevation, aeromagnetics, noise levels, income levels, and pollution levels. The topography of a land surface can be represented in a GIS by digital elevation data. An alternative form of representation is the Triangulated Irregular Network, or TIN, used in vector-based systems.

Topographic functions are used to calculate values that describe the topography at a specific geographic location or in the vicinity of the location. The two most commonly used terrain parameters are the slope and aspect, which are calculated using the elevation data of the neighbouring points.

Slope is the measure of change in surface value over distance, expressed in degrees or as a percentage. For example, a rise of 2 meters over a distance of 100 meters describes a 2% slope with an angle of about 1.15 degrees. Mathematically, slope is the first derivative of the surface. The maximum slope is termed the gradient. From a raster-format DEM, a second grid can be created in which each cell holds the slope at that position; the maximum difference can then be found and the gradient determined.

Aspect is the direction that a surface faces. It is defined by the horizontal and vertical angles that the surface faces. From a raster-format DEM, another grid can be created for aspect, with a number assigned to each specific direction.
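The 2% example and the idea of deriving slope and aspect from neighbouring elevations can be checked with a short sketch. It uses simple central differences on the four direct neighbours of a cell, which is only one of several possible estimators. The grid layout (row 0 is the northern edge), the cell size, and the convention that aspect is the compass bearing of the downhill direction are assumptions.

```python
import math


def slope_and_aspect(dem, r, c, cell_size):
    """Slope (percent, degrees) and aspect (degrees clockwise from north)."""
    dz_east = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
    dz_north = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cell_size)
    rise_over_run = math.hypot(dz_east, dz_north)
    slope_percent = 100.0 * rise_over_run
    slope_degrees = math.degrees(math.atan(rise_over_run))
    # Aspect: bearing of the downhill direction (opposite of the gradient).
    aspect = math.degrees(math.atan2(-dz_east, -dz_north)) % 360.0
    return slope_percent, slope_degrees, aspect


# Elevations in meters on 100 m cells: the surface rises 2 m per cell eastward.
dem = [
    [100, 102, 104],
    [100, 102, 104],
    [100, 102, 104],
]
percent, degrees, aspect = slope_and_aspect(dem, 1, 1, cell_size=100)
# percent is 2.0, degrees is about 1.15, aspect is 270.0 (the slope faces west)
```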

Sun intensity is derived from the combination of slope and aspect. Illumination portrays the effect of shining a light onto a 3-dimensional surface (Stan Aronoff, 1989).


Thiessen Polygons

Thiessen or Voronoi polygons define individual areas of influence around each of a set of points. Their boundaries enclose the area that is closer to each point than to any other point. They are generated from a set of points and are mathematically defined by the perpendicular bisectors of the lines between neighboring points. A TIN structure is used to create Thiessen polygons.
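The "closest point" definition can be illustrated with a brute-force raster approximation: every cell is labeled with the id of its nearest point. This is not the TIN-based construction mentioned above, only a sketch of the resulting areas of influence. The point names, coordinates, and grid size are hypothetical.

```python
def nearest_point_id(x, y, points):
    """Id of the point closest to (x, y); points maps id -> (x, y)."""
    return min(
        points,
        key=lambda pid: (points[pid][0] - x) ** 2 + (points[pid][1] - y) ** 2,
    )


def thiessen_grid(points, cols, rows):
    """Label each unit cell (by its center) with the nearest point id."""
    return [
        [nearest_point_id(c + 0.5, r + 0.5, points) for c in range(cols)]
        for r in range(rows)
    ]


stations = {"A": (1.0, 1.0), "B": (5.0, 1.0)}
areas = thiessen_grid(stations, cols=6, rows=2)
# Each row is ["A", "A", "A", "B", "B", "B"]: the boundary is the
# perpendicular bisector x = 3 of the line between A and B.
```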



Interpolation

Interpolation is the procedure of predicting unknown values using the known values at neighboring locations. The quality of the interpolation results depends on the accuracy, number, and distribution of the known points used in the calculation and on how well the mathematical function models the phenomenon.
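One simple interpolation function is inverse distance weighting (IDW), sketched below. The text does not prescribe a particular method, so IDW, the power of 2, and the sample values are assumptions. Each known point contributes in proportion to 1 / distance^power, so nearby points dominate the estimate.

```python
def idw(x, y, samples, power=2):
    """Inverse-distance-weighted estimate at (x, y).

    samples: iterable of (sample_x, sample_y, value).
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        d2 = (sx - x) ** 2 + (sy - y) ** 2
        if d2 == 0:
            return float(value)  # exactly on a known point
        weight = 1.0 / d2 ** (power / 2)
        numerator += weight * value
        denominator += weight
    return numerator / denominator


known = [(0, 0, 10), (10, 0, 20)]
midway = idw(5, 0, known)       # equal weights, so about 15
near_first = idw(2, 0, known)   # pulled toward the value 10
at_sample = idw(0, 0, known)    # 10.0
```

The result is only as good as the sample points: with few or poorly distributed points, any weighting function will give a poor estimate.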


4. Cartographic Modeling

5. Connectivity Functions

6. Output Functions




  1. conflation
  2. edge-matching
  3. line coordinate thinning
  4. address matching
  5. rubber sheeting
  6. retrieval
  7. classification
  8. overlay
  9. point-in-polygon
  10. thiessen polygon
  11. interpolation
  12. line-in-polygon


1. Please present your own opinion on how to use GIS analysis functions.

2. Please discuss the statement "Spatial analysis functions are the power of GIS."

3. Please give an example to describe the GIS overlay function.

4. Describe the differences and relationships between spatial data and attribute data.

5. Use figures or examples to describe the edge matching function.

6. What is a sliver polygon? Describe how sliver polygons are created and which function is used to delete them.

7. Describe what slope and aspect are and how to measure them.

Note: please see Michael L. Hauschild's page for the last three parts.

Submitted by Chengdai Liu.