Cartography and the Internet:

Introduction and Research Agenda

 

Michael P. Peterson

University of Nebraska at Omaha

geolib@cwis.unomaha.edu




Abstract


The Internet is helping to redefine how maps are used. Maps are now delivered to the user in a fraction of the time required to distribute maps on paper, allowing them to be viewed in a more timely fashion. Weather maps, for example, are posted on an hourly basis. Most importantly, maps on the Internet are more interactive. They are accessed through a hyperlinking structure that makes it possible to engage the map user on a higher level than is possible with a map on paper. Finally, the Internet is making the distribution of cartographic animations possible. The Internet presents cartographers with a faster method of map distribution, different forms of mapping, and new areas of research.


Introduction

Large numbers of maps are now distributed through the Internet. Individual web sites respond to over 700,000 requests for maps every day (e.g., GeoSystems MapQuest), and there are thousands of web sites that distribute maps. A major reason for this change in how maps are delivered to the user is cost. It is simply less expensive to distribute maps through the web than it is to print and distribute them on paper. A second reason is time. Maps on computer networks are delivered in a fraction of the time previously required for maps on paper. This also makes the maps that are delivered more current. A third reason is the potential for interaction. Users can choose a location to map and the features to include on the map. A fourth reason is the potential for the display of cartographic animations, a long neglected form of mapping. These are just a few of the reasons that the distribution of maps through the Internet is growing rapidly.
This paper and the associated documents on the World Wide Web (http://maps.unomaha.edu/NACIS/cp26/article1.html) examine this new way to distribute maps. The first section reviews the growth, development, and use of the Internet, the World Wide Web, and Web search engines. The second section examines the current state of Web-based mapping. The last section addresses those areas of cartography associated with the Internet that need further research.



I. The Internet and the World Wide Web

The Internet has been described in many ways. In the simplest sense, the Internet may be thought of as a system for transferring files between computers. These files, manipulated as numbers and ultimately stored and transferred in binary 0s and 1s, may consist of text, pictures, graphics, sound, animations, movies, or even computer programs. Defined in terms of hardware, the Internet may be thought of as a physical collection of computers, routers, and high-speed communication lines. In terms of software, it is a network composed of computer networks that are based on the TCP/IP protocol. In terms of content, the Internet is a collection of shared resources. Finally, and most importantly, from a human standpoint, the Internet is a large and ever-expanding community of people who contribute to its content and use its resources.

The beginnings of the Internet can be found in ARPANET - a computer network created for the Advanced Research Projects Agency and funded by the U.S. Department of Defense. The initial purpose of the network was to help scientists work together and also to create a network with a redundantly linked structure that would continue to work even after a limited nuclear attack. The original Network Control Protocol (NCP) was implemented in 1969 between UCLA, the Stanford Research Institute, UC-Santa Barbara, and the University of Utah. ARPANET switched from the NCP protocol to the currently used TCP/IP (Transmission Control Protocol/Internet Protocol) on January 1, 1983. Many view this date as the beginning of the Internet.

The ARPANET model specified that data communication always occurs between a source and a destination computer. Further, the network connecting any two computers is assumed to be unreliable and could disappear at any moment. Data sent from computer to computer was put in an "envelope," called an Internet Protocol (IP) packet, with an appropriate "address." The computers - not the network - had the responsibility for routing the messages. All computers could communicate as a peer with any other computer. If a certain connection between two computers was inoperative, the computer would reroute the message to another computer that would attempt to "deliver" the message.
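The rerouting idea described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the four computers A through D, their links, and the function name find_route are hypothetical, and the route is found with a single breadth-first search rather than the hop-by-hop decisions made by real IP routers.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    """The "envelope": an address pair plus the data being carried."""
    source: str
    destination: str
    payload: str


def find_route(links, packet, down=frozenset()):
    """Return a list of computers from source to destination, or None.

    `links` maps each computer to its neighbors; `down` is a set of
    frozenset pairs naming connections that are currently inoperative.
    """
    queue = deque([[packet.source]])
    visited = {packet.source}
    while queue:
        path = queue.popleft()
        here = path[-1]
        if here == packet.destination:
            return path
        for neighbor in sorted(links.get(here, ())):
            if neighbor in visited or frozenset((here, neighbor)) in down:
                continue  # already tried, or the connection is inoperative
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None


# Hypothetical network: A and D are each linked to both B and C.
LINKS = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}

packet = Packet("A", "D", "hello")
normal = find_route(LINKS, packet)
rerouted = find_route(LINKS, packet, down={frozenset(("A", "B"))})
print(normal)    # ['A', 'B', 'D']
print(rerouted)  # ['A', 'C', 'D']
```

When the connection between A and B is marked as down, the same packet still arrives, only by way of C. The redundant links are what make this possible.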

The ARPANET model was attractive to governments and universities that had not standardized on a particular computer system. The data communications model specified by ARPANET was emulated on a local level to connect the often dissimilar computers within an organization, particularly when desktop workstations became widely available by the mid-1980s. Workstations, in particular, created a new model of networking. Rather than connecting to a single large timesharing computer per site, users wanted to connect their entire local networks to ARPANET.

The model was also used in the late 1980s by NSFNET, commissioned by the National Science Foundation (NSF). NSFNET was designed to distribute the computing power of five supercomputers at major universities so that they could be used for scholarly research. Increasing demand on the network throughout the 1980s forced the U.S. government to commission the NSF to oversee the entire Internet. More research and educational institutions were connected on a high-speed Internet "backbone." Eventually, Internet service providers (ISPs) expanded the network to include telephone access from homes.

The Internet has become an international computer network that links academic, military, government, and commercial computers. It is not managed by any one entity; rather, it is a system of networks based on the TCP/IP protocol that are linked together in a cooperative, non-centralized collaboration. The Internet consists of five main components or protocols: 1) File Transfer Protocol (FTP) for exchanging files between computers; 2) Telnet - a remote log-on procedure for accessing programs on remote computers as though they were local; 3) e-mail - an electronic mail system whereby one can exchange mail messages between Internet users and many networks outside the Internet (e.g., BITNET); 4) Newsgroups - discussion groups which distribute information to groups of users, providing a forum for researchers; and 5) the World Wide Web - a hypermedia system that incorporates most aspects of the previous four services and delivers files in multiple forms, including text, pictures, sound, and animation.

The text-based file transfer systems, including FTP, Telnet, e-mail, and newsgroups, developed quickly throughout the 1980s. FTP servers became fairly widespread by the end of the decade, but as the number of available files kept increasing, searching for a particular file became unmanageable. Search systems, including Archie and Gopher, were established to help find particular files, but the complexity of these systems limited their general usefulness. The prevalence of text files and the difficulty of transferring and viewing graphic files made FTP less than appealing to most computer users.

The World Wide Web
The introduction of the World Wide Web in the early 1990s addressed many of the problems associated with using the Internet. Files could now be accessed using a pointing device such as a mouse. A link within a document could access another document on that computer, or on any other computer that supported this protocol. The selection of a link automatically made a connection to the remote computer and downloaded the document, which could be a text, graphic, sound, animation, or any other type of file. Based on the concepts of hypertext and hypermedia, the web promoted a logical linking of files, much as the human brain links related pieces of information. The World Wide Web was a milestone in network computing technology because it opened the Internet to people with little computing background. It is largely responsible for the dramatic growth of the Internet during the early part of the 1990s.

The World Wide Web was developed in 1989 at the European Particle Physics Laboratory (CERN) located near Geneva. Tim Berners-Lee played a large role in designing the system. It was intended to assist researchers in high energy physics research by linking related documents. The developers wanted to create a seamless network in which information from any source could be accessed in a simple and consistent way. The WWW introduced the principle of "universal readership," which states that networked information should be accessible from any type of computer in any country using a single program. A prototype of this new protocol was completed in 1991 and was widely accepted by 1994. The system was quickly embraced because it also incorporated the previous protocols for file exchange, including FTP, newsgroups, and mail.

The early popularity of the WWW was demonstrated by the quick adoption of the Mosaic WWW browser. Developed and freely distributed by the National Center for Supercomputing Applications (NCSA) in Urbana, Illinois, Mosaic was released for all common computer platforms, including UNIX, PC/Windows, and Macintosh, in September of 1993. Implementing the hypermedia file-access structure, the program incorporated hypertext and hyperimages to create links to other documents, either text or graphic. The growth of the web during this period was particularly dramatic. Much of the increase in WWW traffic can be attributed to Internet access providers, including such commercial ventures as America Online.

World Wide Web Browsers
Mosaic from NCSA was the first widely accepted, multimedia-based web browser. Many other web browsers have since become available, although some browsers, such as Lynx, only display text. One of the more popular browsers is Netscape Navigator. Its main programmer, Marc Andreessen, wrote Mosaic and then left NCSA to help form Netscape Communications, Inc. The company attracted phenomenal initial investment in the mid-1990s based on the expectation that there would be continued growth of the Internet, particularly the web. Updated versions of Mosaic can still be obtained at no cost from NCSA, and Microsoft provides the Explorer browser along with its Windows operating systems. Updates to the popular browsers are available through the web, and new functions are being added to the software on a regular basis.

The capabilities of World Wide Web browsers can be enhanced with "helper" programs called plug-ins that add certain capabilities to the browser. For example, most browsers cannot display QuickTime or MPEG files, which are standard formats used for the display of animations. Separate plug-ins must be downloaded and installed to display these files.

All browsers download and display material from an "http" site (HyperText Transfer Protocol). The "http" address has a consistent structure, as in the example below:

http://maps.unomaha.edu

The "http" prefix is always followed by a colon and two slashes (some browsers, such as Netscape, no longer require the "http://" as part of the address). Following this is the actual address beginning with the name that has been assigned to a particular computer; in this case, "maps". Next is the "domain" name that indicates where that computer is located (the University of Nebraska at Omaha, or "unomaha"). Finally, the "edu" specifies that the computer is at an educational site. This particular address will display a "home page" for that computer. By adding directory and file name information, a user can access other files on the system:

http://maps.unomaha.edu/NACIS/cp26/article1.html

This address would display a file called "article1.html" within a directory (or folder) called cp26 that is within the directory NACIS. The file is the web page associated with this article.
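The structure of the address can also be taken apart by a program. The short Python sketch below uses the standard urllib.parse and pathlib modules. The function name describe and the labels in the result are assumptions made for illustration, and the code assumes a three-part host name such as the one in the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def describe(url):
    """Split a web address into the parts discussed in the text."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    path = PurePosixPath(parts.path)
    return {
        "protocol": parts.scheme,               # "http"
        "computer": labels[0],                  # name assigned to the machine
        "domain": ".".join(labels[1:-1]),       # where the machine is located
        "site type": labels[-1],                # "edu" for an educational site
        "directories": list(path.parent.parts[1:]),
        "file": path.name,
    }


home = describe("http://maps.unomaha.edu")
article = describe("http://maps.unomaha.edu/NACIS/cp26/article1.html")
print(article["computer"], article["domain"], article["site type"])
print(article["directories"], article["file"])
```

For the home page address the list of directories and the file name are both empty, which is why the server returns its default "home page."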

Web Search Engines
A search engine consists of two major types of programs. The first examines all known web pages and creates an index based on a defined set of keywords. The second program responds to user "keyword" requests to this index. A particular keyword may return many "hits" or matches. The ordering of matches is based on a variety of criteria, sometimes a function of how often the particular keyword is included in the document. Search engines work continuously to find and sort new material. For example, AltaVista, operated by the Digital Equipment Corporation, indexes material - "crawls the web" - at 3 million pages a day. By mid-1996, AltaVista had indexed 30 million pages and the site was receiving 12 million daily keyword requests. The purpose of this continuous indexing is to find new material and to update the HTTP addresses of pages that have already been indexed.

Depending on the search engine, a keyword will return a large number of documents. For example, the keyword "maps" returns 1,549,205 matches with the AltaVista search engine (February 1997). This means that the search engine found this many documents that contained the word "maps". The combination of "maps+world" returns only 800. There are many ways of limiting the search to a more specific topic, but the syntax for doing so varies among the different search engines. Effectively "surfing the web" requires a good working knowledge of several search engines.
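The two programs that make up a search engine, and the effect of combining keywords, can be illustrated with a minimal Python sketch. It is not a description of how AltaVista or any other engine actually works. The three pages and their addresses are invented, the "+" syntax simply imitates the example above, and ranking by keyword frequency is only one of the possible ordering criteria.

```python
import re
from collections import Counter, defaultdict


def build_index(pages):
    """First program: record how often each word occurs on each page."""
    index = defaultdict(Counter)
    for address, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word][address] += 1
    return index


def search(index, query):
    """Second program: answer a keyword request such as "maps+world".

    A page matches only if it contains every keyword. Matches are
    ordered by how often the keywords occur on the page.
    """
    keywords = query.lower().split("+")
    if not all(keyword in index for keyword in keywords):
        return []
    matches = set(index[keywords[0]])
    for keyword in keywords[1:]:
        matches &= set(index[keyword])

    def frequency(address):
        return sum(index[keyword][address] for keyword in keywords)

    return sorted(matches, key=lambda address: (-frequency(address), address))


# Hypothetical pages standing in for the indexed web.
PAGES = {
    "http://example.org/a": "Maps of the world and more maps",
    "http://example.org/b": "Weather maps for the United States",
    "http://example.org/c": "World news",
}

index = build_index(PAGES)
print(search(index, "maps"))        # pages a and b, a first
print(search(index, "maps+world"))  # page a only
```

As in the AltaVista example, adding a second keyword can only shorten the list of matches, because every additional keyword is one more condition a page must satisfy.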

II. Maps on the Web

Graphics, including maps and images taken from satellites, have become a major component of the web. This can be attributed to the relatively lower cost of placing color graphics on the web compared to printing. When the costs of shipping and distribution are added, the cost advantages of distributing maps and images over the Internet become even more apparent.

Graphics on the Internet are usually based on a raster format in which the image is represented as a grid of "picture elements" called pixels. Each grid square is assigned a number that is represented as a grey shade or color. The most common raster format for graphic files is GIF (Graphics Interchange Format). Limited to 256 shades or colors, GIF files have become a standard way of distributing pictures in electronic form. This graphics format is widely adopted and supported by most web browsers. An alternative image format is JPEG (Joint Photographic Experts Group). This format is better suited for photographs because it is not limited to 256 shades or colors. However, the format makes use of compression algorithms that result in a loss of detail. Although not apparent in photographs, this loss of sharpness is noticeable on maps as a fuzziness introduced in the line-work.
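The idea of a grid of numbers that stand for colors can be made concrete with a small Python sketch. The palette, the grid, and the function names are invented for illustration. The size comparison counts raw bytes only and ignores the compression that the GIF and JPEG formats apply.

```python
# Hypothetical palette; an indexed image of this kind may hold at most 256 entries.
PALETTE = [(255, 255, 255), (0, 0, 0), (0, 0, 255)]  # white, black, blue

# Each grid square holds a palette index, not a color.
GRID = [
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [1, 1, 0, 2],
]


def to_rgb(grid, palette):
    """Replace every palette index with its red, green, blue triple."""
    if len(palette) > 256:
        raise ValueError("an indexed image of this kind is limited to 256 colors")
    return [[palette[index] for index in row] for row in grid]


def raw_sizes(grid, palette):
    """Uncompressed bytes: one per pixel plus the palette, versus three per pixel."""
    pixels = sum(len(row) for row in grid)
    indexed = pixels + 3 * len(palette)
    full_color = 3 * pixels
    return indexed, full_color


image = to_rgb(GRID, PALETTE)
print(image[1][1])                # (0, 0, 0)
print(raw_sizes(GRID, PALETTE))   # (21, 36)
```

Storing one small number per pixel is what keeps a palette-based file compact, and the limit on the size of the palette is the source of the 256-color restriction.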

Static Maps
Many of the static maps available on the Internet have been scanned from paper maps and stored in a GIF or JPEG format (scanned map of Africa). Although scanning quickly converts a map into digital form for transmission, the maps that result are often illegible. Sometimes so little care is taken in the scanning process that the text on the back side of the paper map appears in the scanned version. The screen pattern (little dots) of printed maps, particularly those printed in color, is also visible in the scanned version.

Other forms of static maps include weather maps, maps of demographic distributions (Current temperature map, United States Per Capita GNP), and other types of thematic maps. Most of these maps have been designed specifically for display on a computer terminal and are much more legible than maps that have simply been scanned. Weather maps, in particular, account for much of the network traffic. They have been displayed on television for many years and incorporate design considerations that make them more suitable for display on a computer monitor. All of these maps are in the GIF format.

Static maps with a higher spatial resolution are also available on the Web. A common file type that is used for these maps is Adobe's Portable Document Format (PDF), based on Adobe's widely used PostScript language. PDF files are designed for both screen viewing and printing. Because they are "resolution independent," they can be made larger without loss of detail and can also take advantage of the resolution of the printer. The Adobe Acrobat Reader is a plug-in that displays and prints these files.

Interactive Maps
To overcome the limitation of "spatial resolution," maps available through the Internet are typically more dynamic. They are updated frequently, generally incorporate some type of interaction, or display a series of maps that can be viewed as an animation. The combination of maps with the Internet is a significant development, not only for improving the distribution of maps but also because it makes possible a more interactive form of mapping, one that engages the map user to a much greater extent than maps on paper.

A variety of web sites incorporate interactive maps. The user can change these maps by choosing various map display options. Map sites, such as those located at Xerox PARC and Fourmilab in Switzerland, are early examples of the type of interaction that can be implemented with maps on the web. The interactive Xerox PARC site allows the display of alternative projections and separate map layers including country boundaries, waterways, and transportation networks. The site responds to nearly 90,000 daily requests for world maps. The Earthview site at Fourmilab displays views of the earth from the sun, the moon, or orbiting satellites, and includes the overlay of current cloud patterns derived from weather satellites.

Maps that are frequently updated include maps of traffic flow, as in the example of the current traffic in Houston. Interactive street level mapping of the US is available from MapQuest, MapBlast, and MapOnUs. These maps are based on the TIGER map file, a product of the U.S. Census Bureau, which maintains its own mapping site. The location of bank teller machines on these maps can be obtained through a site supported by VISA. Interactive mapping with demographic data is available through CIESIN. This site lets the user choose an area and a data value to map within the United States. Geomatics Canada has implemented a type of on-line, interactive mapping system that has some of the characteristics of a GIS.

Animated Maps
Animated maps are also available through computer networks. Map animations are usually stored in a format designed for the display of digital movies, such as QuickTime or MPEG. The most common examples of animated maps on the Internet are those of weather patterns, most often depicting the movement of clouds as seen on television weather forecasts. The movement of cloud patterns associated with hurricanes is especially suited for viewing as an animation. Other types of animated maps include terrain fly-throughs in which a landscape, usually somewhat mountainous, is viewed as if it were being flown through with an airplane or jet. Animations are also available showing population growth in a region. Here a shading is applied progressively to depict the pattern of population growth. Finally, animations are available that depict temporal trends or alternative methods of data classification, such as changes in the classification method or number of classes.


III. The Internet and Cartographic Research


One of the major problems associated with maps is the difficulty many people have in using them. It has been estimated that more than half of the educated population do not have a basic understanding of maps. The reasons for this are not well understood. Some see the problem as a lack of education specific to map use, while others say it is the maps themselves or, even more specifically, the static means of displaying maps on paper. Whatever the reason, it is clear that people have a poorly formed mental representation of their local environment that becomes more distorted in the areas beyond their direct experience.

The Internet has made possible both new forms of maps and different ways of using them and, perhaps, has created a new category of map user. The phenomenal growth in this new form of mapping has highlighted a number of areas in need of research that can be broadly categorized under "map use" and "map development." Map use will be examined first.


Internet Map Use


Number of Maps Distributed - A large number of maps are sent through the Internet but the exact number is difficult to determine. Sites that advertise generally maintain records on the number of "hits" (file accesses). However, many sites that distribute maps do not advertise and therefore have no reason to keep a record of the number of people who access their site or use their maps. How can map use on the Internet be measured when records are not kept of the number of maps that are distributed? One solution is to maintain a centralized, on-line database of registered Internet map publishers along with records on the number of "hits" per map. An automated system would poll each site on a daily or weekly basis and record the number of hits for that time period.
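The record-keeping side of such a centralized system might look like the Python sketch below. No such registry exists, so everything here is hypothetical: the class and method names, the map addresses, and above all the assumption that each site reports a running total of hits, from which the count for a period is obtained by subtraction. The network polling itself is left out.

```python
from collections import defaultdict


class MapHitRegistry:
    """Hypothetical central record of hits per map for each polling period."""

    def __init__(self):
        self._last_total = {}                    # running total seen at the last poll
        self.hits_by_period = defaultdict(dict)  # period -> {map address: hits}

    def record_poll(self, period, cumulative_totals):
        """Store the hits gained by each map since the previous poll."""
        for map_address, total in cumulative_totals.items():
            previous = self._last_total.get(map_address, 0)
            self.hits_by_period[period][map_address] = total - previous
            self._last_total[map_address] = total

    def total_for(self, period):
        """Number of maps distributed by all registered sites in one period."""
        return sum(self.hits_by_period[period].values())


registry = MapHitRegistry()
registry.record_poll("week 1", {
    "http://example.org/world.gif": 120,
    "http://example.org/africa.gif": 30,
})
registry.record_poll("week 2", {
    "http://example.org/world.gif": 200,
    "http://example.org/africa.gif": 45,
})
print(registry.total_for("week 1"))  # 150
print(registry.total_for("week 2"))  # 95
```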

Types of Internet Map Use - How maps are being used on the Internet and the profile of the map user are more difficult to ascertain than simple "hits." Is the purpose of map use "casual" or "goal-oriented"? Is there a preference for certain kinds of maps? How are interactive maps being used?

Such questions are being addressed by research associated with the development of the Alexandria Digital Map Library. The use of maps in this library is being studied by examining the log files of its web sessions. The log files maintain information on the types of maps that are accessed, how long they are viewed, what map is viewed before and after, and where the user clicks on the map. While this could appear to be an invasion of privacy, users entering the site are warned that their use of the maps is being monitored. Analysis of the log files provides valuable data on how the maps are being used and who is using them.
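The kind of analysis described above can be sketched in Python. The log format shown here (a session identifier, a time in seconds, and a map name) is invented for illustration and is not the format used by the Alexandria project. Viewing time is estimated as the interval until the next request in the same session, so the last map of each session has no estimate.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (session, seconds since start of log, map requested).
LOG = [
    ("s1", 0, "world"),
    ("s1", 40, "africa"),
    ("s2", 10, "world"),
    ("s1", 100, "kenya"),
    ("s2", 25, "europe"),
]


def analyze(log):
    """Return viewing times per map and counts of map-to-map transitions."""
    sessions = defaultdict(list)
    for session, seconds, map_name in sorted(log, key=lambda r: (r[0], r[1])):
        sessions[session].append((seconds, map_name))

    viewing_time = defaultdict(list)
    transitions = Counter()
    for requests in sessions.values():
        # Pair each request with the one that follows it in the same session.
        for (t0, current), (t1, following) in zip(requests, requests[1:]):
            viewing_time[current].append(t1 - t0)
            transitions[(current, following)] += 1
    return viewing_time, transitions


viewing_time, transitions = analyze(LOG)
print(dict(viewing_time))   # {'world': [40, 15], 'africa': [60]}
print(transitions.most_common())
```

The transition counts answer the question of which map is viewed before and after another, and the viewing times give a rough indication of whether a map was studied or merely passed through.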

Internet Map Use by Children - Despite a concern about the adult nature of some of the material on the Internet, it is clear that the Web is being used in education at all levels. The implications for cartography are enormous, particularly in the training of a new generation of map users. Current maps on weather, earthquakes, volcanic activity, and other natural phenomena not only bring greater interest to these subject areas, but also promote the use of maps in their study. The graphical nature of the Web can promote the use of maps in a variety of ways. The use of Internet maps by children, in particular, deserves study.


Internet Map Development


The second major area of research is the development of new and better forms of mapping. Web maps have already permitted new forms of interactive mapping, and this may continue to be an area where most developments occur. However, a great deal of research and development work needs to be done with the other two forms of mapping: static and animated.

Graphic Design - Issues of graphic presentation have been difficult to address in a typical research framework. Through the Internet, many different types of maps can be viewed, thereby making comparisons and evaluations possible. A "gallery" of well-designed maps could be promoted to help improve map design. Although the quality of maps presently available through the Internet is not very good, the access to maps that it provides improves the exchange of information, including that on map design. Research is needed on how this exchange of information can best be accomplished.

File Formats - Most maps are presently distributed using the raster (grid-based) GIF and JPEG formats, and the proprietary, vector-based Adobe Acrobat PDF format. None of these formats is designed for maps. GIF and JPEG are used mainly to distribute pictures, the former limited to 256 colors and the latter implementing a sophisticated compression scheme. The PDF format is designed for the distribution of printed documents. All of these formats produce larger files than would be necessary for the distribution of maps, and therefore take longer to download.

This problem could be solved by more compact map file formats and methods of map distribution where the receiving computer (client) would partially "reconstruct" the map. The format could even be specific to a map type. For example, a format for a choropleth map could consist simply of polygon outlines in screen coordinates, associated shading values, and text strings. A client-side plug-in or Java applet would then reconstruct the map for display. The advantage of this approach becomes clear when another data set is chosen for display, and it simply downloads an updated set of shadings and text strings to update the map.
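A minimal Python sketch of this idea follows. The compact format, the region names, the screen coordinates, and the shading values are all invented for illustration, and JSON is used only as a convenient way to measure what would be transmitted. No actual plug-in or applet is implied.

```python
import json

# Sent once: polygon outlines in screen coordinates (hypothetical regions).
OUTLINES = {
    "north": [(0, 0), (10, 0), (10, 5), (0, 5)],
    "south": [(0, 5), (10, 5), (10, 10), (0, 10)],
}

# Sent for each data set: shading values and text strings only.
INCOME = {"title": "Per capita income", "shades": {"north": 3, "south": 1}}
DENSITY = {"title": "Population density", "shades": {"north": 0, "south": 2}}


def reconstruct(outlines, theme):
    """Client side: combine stored outlines with a theme to form a map."""
    return {
        "title": theme["title"],
        "regions": [
            {"name": name, "outline": outline, "shade": theme["shades"][name]}
            for name, outline in outlines.items()
        ],
    }


first_map = reconstruct(OUTLINES, INCOME)
second_map = reconstruct(OUTLINES, DENSITY)   # outlines are reused, not re-sent

full_size = len(json.dumps({"outlines": OUTLINES, **INCOME}))
update_size = len(json.dumps(DENSITY))
print(full_size, update_size)
```

Only the first map requires the outlines to be downloaded. Every later map costs just the few bytes of the new shadings and text, and the saving grows with the number and complexity of the polygons.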

Map Printing - Another issue associated with static maps is the distribution of maps intended for printing. Paper maps will not disappear because of the Internet. On the contrary, while fewer maps may be mass-produced on printing presses, more individual maps will likely be printed with computer printers. The question then focuses on the quality of these individually printed maps when there is such a variety of printers available. To some degree, the widely used PostScript page description language that is used by higher-end printers addresses many of the basic printing issues, but it cannot adjust for the different ways that printers render colors or for all of the nuances associated with printer resolution.

Map Scale - The representation of map scale is a problem that is related to both the display of maps on the screen and the printed product. Display devices have different resolutions, and display text and graphics at different sizes. Traditional methods of representing scale, such as the representative fraction (e.g., 1:24000) or verbal scales (e.g., 1 inch:10 miles; 1 cm:10 km), are not viable unless the size of the map can be controlled on the screen and in the printed version. The graphic scale is a solution to this problem because its size varies with the size of the map. However, the use of the graphic scale is inexact and cumbersome. If the representative fraction and verbal scales are to be used, efforts need to be made to standardize the displays on both monitors and printers.
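The dependence of the representative fraction on the display can be shown with a short calculation, sketched here in Python. The map dimensions and the two display resolutions are assumed values chosen for illustration. The only fixed quantity is the conversion of 0.0254 meters per inch.

```python
METERS_PER_INCH = 0.0254


def scale_denominator(ground_width_m, map_width_px, pixels_per_inch):
    """Denominator of the representative fraction for a map shown on a display."""
    displayed_width_m = map_width_px / pixels_per_inch * METERS_PER_INCH
    return ground_width_m / displayed_width_m


def scale_bar_px(bar_ground_m, ground_width_m, map_width_px):
    """Length in pixels of a graphic scale bar for a given ground distance."""
    return bar_ground_m / ground_width_m * map_width_px


# Hypothetical map: 6,096 m of ground drawn across 720 pixels.
GROUND_WIDTH_M = 6096.0
MAP_WIDTH_PX = 720

for ppi in (72, 96):
    denominator = scale_denominator(GROUND_WIDTH_M, MAP_WIDTH_PX, ppi)
    print(f"{ppi} pixels per inch -> 1:{round(denominator)}")

# The scale bar is computed without reference to the display resolution.
print(round(scale_bar_px(1000.0, GROUND_WIDTH_M, MAP_WIDTH_PX), 1))
```

The same file is a 1:24000 map on a display with 72 pixels per inch and a 1:32000 map on one with 96 pixels per inch, so a representative fraction stored with the map would be wrong on at least one of them. The graphic scale bar shrinks or grows along with the map and therefore remains correct.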

Large Format Maps - Finally, the printing of large format maps is an additional problem. The Maps on Demand system from 3M Corp. (http://www.mmm.com/front/bosnia/index.html) solves this problem by combining a large-format, ink-jet printer, specialized paper, and a print engine. However, the rasterized file required by the system is enormous and cannot be easily transmitted through the Internet. Plans are being made to maintain the basic information on CD-ROM and only transmit the changes by Internet. The system demonstrates that high-quality, large-format maps can be printed one at a time, and this method of printing can be at least aided by the Internet.


Conclusion

Maps are an important source of information from which people form their impressions about places and distributions. Each map is a view of the earth that affects the way we think about the world. Our thoughts about the space in which we live and especially the areas beyond our direct perception are largely influenced by the representations of space that we see, and the way we think about our environment influences the way we act within it. The Internet has already improved the distribution of maps. If done properly, the Internet also has the potential to improve the quality of maps as a form of communication, thereby changing both the mental representations that people have of the world and how people mentally process ideas about spatial relationships.

The Internet has changed the process of mapping and map use. The new medium has already led to more interactive forms of mapping and the increased availability of map animations. A great deal of work will certainly be done in these areas. It is also important, however, to look at problems in the distribution of static maps and issues that deal with their display. Certainly, much work lies ahead in order to make the Internet an effective means of transmitting spatial information in the form of maps.



Internet References

On-line book about the Internet


Zen and the Art of the Internet
http://www.cs.indiana.edu/docproject/zen/zen-1.0.html

World Wide Web


An introduction to the World-Wide-Web
http://aslan.math.hmc.edu/codee/intro-www.html

World Wide Web Frequently Asked Questions
http://www.fcm.missouri.edu/book/www.htm

Glossary of World Wide Web Terms
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Glossary/GlossaryTable.html



World Wide Web Resources

An Overview of Browsers
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Glossary/GlossaryTable.html

QuickTime Plug-in
http://quicktime.apple.com/sw/sw.html

MPEG Plug-in
http://www.intervu.com/player/player.html


Search Engines


How To Use Web Search Engines
http://www.monash.com/spidap2.html

AltaVista Search Engine
http://altavista.digital.com


Static Maps


Africa (scanned)
http://www.lib.utexas.edu:80/Libs/PCL/Map_collection/africa/Africa_pol95.jpg

Burma (scanned)
http://www.lib.utexas.edu/Libs/PCL/Map_collection/middle_east_and_asia/Burma.GIF

High Temperature Map
http://www.intellicast.com/weather/usa/hitemp/

United States Per Capita GNP
http://www.oseda.missouri.edu/graphics/us/pci90.gif

Files in Adobe Acrobat Format
http://maps.unomaha.edu/Peterson/funda/Maps/

Adobe Acrobat Plug-in
http://www.adobe.com/prodindex/acrobat/readstep.html


Interactive Maps

Earthview
http://www.fourmilab.ch/earthview/vplanet.html

Current traffic in Houston
http://herman.tamu.edu:80/traffic.html

MapQuest
http://www.mapquest.com

MapBlast
http://www.mapblast.com

MapOnUs
http://www.maponus.com

U.S. Census Bureau
http://tiger.census.gov/cgi-bin/mapbrowse-tbl/

VISA
http://visa.infonow.net/powersearch.html

Canada
http://ellesmere.ccm.emr.ca/naismap/naismap.html


Animated Maps


Hurricanes
http://maps.unomaha.edu/AnimArt/Hurricane.mpeg

Terrain Fly-through
http://maps.unomaha.edu/AnimArt/flatgc.mov

Weather Fly-through
http://maps.unomaha.edu/AnimArt/CloudFLY.mov

Population Growth
http://maps.unomaha.edu/AnimArt/sf_bay900.mpeg

Spatial trend
http://maps.unomaha.edu/AnimArt/SpatTrend.MOV

Classification Method
http://maps.unomaha.edu/AnimArt/Class_Anim.MOV

Number of Classes
http://maps.unomaha.edu/AnimArt/GenAnim.MOV