Database Concepts
Outline
- Geodatabase Concepts
- Image vs. Attribute
- Information vs. Data
- Database Management System (DBMS)
- functions
- security
- efficiency
- Components and Characteristics
- field
- data items (attributes)
- entity - collection of related data items
- defined as physical object in GIS
- instance - a single occurrence (e.g., one county) of an entity
- record - row or line in a table
- access to multiple users
- direct file access vs. access to a copy
- centralized control and access
- data independence - from knowledge of the specific file structure of the DBMS
- multiple user views - different view of data depending on application
- Types of databases
- Flat File - excel file
- Hierarchical - file system tree
- Network - multiple parents | possibility for many-to-many relationships
- Relational - tables related through keys
- Hybrid - coordinate data in one database type, attribute data in another | integration of topology
- Normal forms of RDBMS - break large tables into small tables that contain simple functional dependencies
- Questions
1. Geodatabase Concepts
Image vs. Attribute
All GISs have to store digital maps somehow. The organization of the map into digits has a major impact on how we capture, store, and use the map data in a GIS. GISs store data in an organized way as a database. A database is a structured collection of information on a defined subject.
Information versus Data
Information is the result of making sense of data. Data can be just numbers. Internal computer manipulation alone does not produce information; a person needs to be involved.
2. Data Management
Functions
Databases must be organized so that the data contained in them can be retrieved, corrected, deleted, or organized consistently, efficiently, and quickly. This is done through a programming construct called a database management system (DBMS). A DBMS has several consistent parts. The database definition language is the part of the DBMS that allows the user to set up the database, to specify how many attributes there will be, and to state how they are defined and manipulated. The data dictionary contains a catalog of all the attributes with their legal values and ranges. A query language allows a user to perform advanced functions such as sorting, reordering, subsetting, and searching. Databases are normally constructed by the following steps:
- Define the database contents.
- Insert new data into the database contents.
- Delete old data.
- Query what is in the content.
- Modify the content.
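The steps above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed workflow; the table name `counties`, its columns, and the values are hypothetical.

```python
import sqlite3

# In-memory database; table, columns, and values are hypothetical.
conn = sqlite3.connect(":memory:")

# 1. Define the database contents.
conn.execute("CREATE TABLE counties (name TEXT PRIMARY KEY, population INTEGER)")

# 2. Insert new data into the database contents.
conn.executemany(
    "INSERT INTO counties VALUES (?, ?)",
    [("Adams", 1200), ("Baker", 560), ("Clark", 9800)],
)

# 3. Delete old data.
conn.execute("DELETE FROM counties WHERE name = ?", ("Baker",))

# 4. Query what is in the content.
query = "SELECT name, population FROM counties ORDER BY name"
rows_before = conn.execute(query).fetchall()

# 5. Modify the content.
conn.execute("UPDATE counties SET population = ? WHERE name = ?", (1250, "Adams"))
rows_after = conn.execute(query).fetchall()

print(rows_before)
print(rows_after)
```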
Information needed from the user includes:
- Data format definition.
- Data contents definition.
- Value restrictions.
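As a rough sketch of how format, contents, and value restrictions might be recorded, the example below uses a plain Python dictionary as a stand-in for a data dictionary. The attribute names, legal values, and ranges are assumptions made for illustration only.

```python
# Hypothetical data dictionary: each attribute has a type (format) and,
# optionally, a set of legal values or a legal range (value restrictions).
DATA_DICTIONARY = {
    "name": {"type": str},
    "land_use": {"type": str, "legal_values": {"urban", "forest", "water"}},
    "elevation_m": {"type": float, "range": (-500.0, 9000.0)},
}

def validate(record):
    """Return a list of problems found in a record (empty if it is valid)."""
    problems = []
    for attr, rule in DATA_DICTIONARY.items():
        if attr not in record:
            problems.append(f"{attr}: missing")
            continue
        value = record[attr]
        if not isinstance(value, rule["type"]):
            problems.append(f"{attr}: expected {rule['type'].__name__}")
            continue
        if "legal_values" in rule and value not in rule["legal_values"]:
            problems.append(f"{attr}: {value!r} is not a legal value")
        if "range" in rule:
            low, high = rule["range"]
            if not low <= value <= high:
                problems.append(f"{attr}: {value} is outside {low}..{high}")
    return problems

good = {"name": "Adams", "land_use": "forest", "elevation_m": 312.5}
bad = {"name": "Baker", "land_use": "desert", "elevation_m": 12000.0}
print(validate(good))  # no problems reported
print(validate(bad))   # one problem for land_use, one for elevation_m
```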
Security
Security is necessary to ensure the utility of the database. Key security concerns include:
- Limit access via passwords and programming techniques which restrict user capability to alter and manipulate data.
- Integrity is maintained by programming techniques which allow only values at the appropriate ranges and types to be entered in the correct field.
- Synchronization allows the simultaneous use of the database by multiple users. Procedural or programming techniques prevent the inadvertent alteration of database fields by multiple users.
- Physical data independence allows the data to be hardware independent.
- Accurate database manipulation and control allows for the minimization of redundancy.
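One common programming technique for the synchronization concern above is a lock around each update. The sketch below assumes a hypothetical `Field` class shared by four simulated users; a real DBMS uses its own locking or transaction machinery.

```python
import threading

class Field:
    """A single database field shared by several users (hypothetical)."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes the read-modify-write step atomic, so two users
        # cannot overwrite each other's update.
        with self._lock:
            current = self.value
            self.value = current + 1

def user_session(field, updates):
    """Simulate one user applying a number of updates."""
    for _ in range(updates):
        field.increment()

field = Field()
users = [threading.Thread(target=user_session, args=(field, 1000)) for _ in range(4)]
for t in users:
    t.start()
for t in users:
    t.join()

print(field.value)  # 4 users x 1000 updates, none lost
```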
Efficiency
Data in a database must be retrievable efficiently and easily. How the data is stored has a large impact on how efficiently it can be retrieved.
3. Components and Characteristics
Fields
Fields are the values for each attribute. Fields may be fixed or variable length. Fixed length fields are normally used for data that is used for mathematical or other calculations, such as pixel values, where accuracy in data entry is important. Variable length fields are used for highly variable data such as descriptive data.
Database data is normally accessed by either sequential or random access methods. Sequential access is used when data is arranged in a contiguous, linear fashion. An example of this method is tape storage. Sequential access is easy to program; however, it is generally slower than the random access method. Random access allows access to any data located on the storage device in a non-linear fashion. An example is the CD-ROM.
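The difference between the two access methods can be sketched with fixed-length records in an in-memory byte buffer. The record layout (a 4-byte integer ID plus an 8-byte float) and the function names are assumptions chosen for the example.

```python
import io
import struct

# Hypothetical fixed-length record: 4-byte integer ID plus 8-byte float value.
RECORD = struct.Struct("<id")  # 12 bytes per record, no padding

storage = io.BytesIO()
for record_id in range(5):
    storage.write(RECORD.pack(record_id, record_id * 1.5))

def read_sequential(target_id):
    """Scan records from the start until the target is found."""
    storage.seek(0)
    reads = 0
    while True:
        chunk = storage.read(RECORD.size)
        if not chunk:
            return None, reads
        reads += 1
        record_id, value = RECORD.unpack(chunk)
        if record_id == target_id:
            return value, reads

def read_random(position):
    """Jump straight to a record; fixed length makes the offset computable."""
    storage.seek(position * RECORD.size)
    record_id, value = RECORD.unpack(storage.read(RECORD.size))
    return value, 1

print(read_sequential(3))  # (4.5, 4): four records read
print(read_random(3))      # (4.5, 1): one record read
```

Because every record has the same length, the byte offset of record n is simply n times the record size, which is what makes the direct jump possible.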
data items (attributes)
entity
- collection of related data items
instance
- a single occurrence (e.g., one county) of an entity
record
- row or line in a table
access to multiple users
data independence
- from knowledge of the specific file structure of the DBMS
multiple user views
- different view of data depending on application
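Multiple user views can be illustrated with SQL views in Python's sqlite3 module. The `parcels` table, the two view names, and the sample rows are hypothetical; the point is only that two applications see different slices of the same stored data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcels "
    "(parcel_id INTEGER, owner TEXT, land_use TEXT, assessed_value REAL)"
)
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?, ?)",
    [(1, "Smith", "residential", 150000.0), (2, "Jones", "farm", 90000.0)],
)

# A planning application only needs land use.
conn.execute("CREATE VIEW planning_view AS SELECT parcel_id, land_use FROM parcels")

# A tax application needs owner and assessed value instead.
conn.execute(
    "CREATE VIEW tax_view AS SELECT parcel_id, owner, assessed_value FROM parcels"
)

planning_rows = conn.execute("SELECT * FROM planning_view ORDER BY parcel_id").fetchall()
tax_rows = conn.execute("SELECT * FROM tax_view ORDER BY parcel_id").fetchall()
print(planning_rows)
print(tax_rows)
```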
4. Types of databases
Data retrieval is dependent on a data model. A data model is a theoretical construct that becomes the key to performing necessary database functions.
Flat File
A flat file data model is constructed like a table. Data is stored in rows and columns with an index table which contains the locations of the data.
An example is an Excel file.
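A flat file with an index table might look like the following in Python. The rows, the choice of the name column as the key, and the `lookup` helper are assumptions made for the sketch.

```python
# Hypothetical flat file: each row is one record, each column one attribute.
flat_file = [
    ("Adams", "forest", 312.5),
    ("Baker", "urban", 88.0),
    ("Clark", "water", 5.0),
]

# Index table: maps a key (the name column) to the row's location in the file.
index_table = {row[0]: position for position, row in enumerate(flat_file)}

def lookup(name):
    """Find a row through the index table instead of scanning every row."""
    position = index_table.get(name)
    return None if position is None else flat_file[position]

print(lookup("Baker"))
```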
Hierarchical
A hierarchical/network file is organized into levels. A level contains nodes. Each node is linked to nodes beneath it on the next level. As a result, each level gets increasingly complex. An example is the file structure on a typical computer. Hierarchical files can be inefficient in some respects. It is difficult to allow for redundant data. Searching in a hierarchical file can also be inefficient, since a search may require looking down multiple levels repeatedly.
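The search cost described above can be seen in a small sketch. The tree below is a hypothetical hierarchy modeled on a file system, and `find` is an illustrative depth-first search that also counts how many nodes it had to visit.

```python
# Hypothetical hierarchy: each node has a name and a list of child nodes.
tree = {
    "name": "root",
    "children": [
        {"name": "maps", "children": [
            {"name": "roads", "children": []},
            {"name": "rivers", "children": []},
        ]},
        {"name": "tables", "children": [
            {"name": "parcels", "children": []},
        ]},
    ],
}

def find(node, target, path=()):
    """Depth-first search; returns (path to target or None, nodes visited)."""
    path = path + (node["name"],)
    visited = 1
    if node["name"] == target:
        return path, visited
    for child in node["children"]:
        found, seen = find(child, target, path)
        visited += seen
        if found is not None:
            return found, visited
    return None, visited

print(find(tree, "roads"))    # found after visiting 3 nodes
print(find(tree, "parcels"))  # found only after visiting all 6 nodes
```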
Network
- multiple parents, with the possibility of many-to-many relationships
Relational
A relational database is an extension of the flat file. It consists of a series of related tables. The tables are interconnected via a key field. The key field allows tables to be combined by indexing against it. The relational model was introduced in 1970 and is currently the most popular method used.
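Combining tables through a key field can be sketched with a join in Python's sqlite3 module. The two tables, the key field `county_id`, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two related tables that share the key field county_id.
conn.execute("CREATE TABLE counties (county_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE soils (county_id INTEGER, soil_type TEXT)")
conn.executemany("INSERT INTO counties VALUES (?, ?)", [(1, "Adams"), (2, "Baker")])
conn.executemany(
    "INSERT INTO soils VALUES (?, ?)",
    [(1, "clay"), (1, "loam"), (2, "sand")],
)

# Combine the tables by matching rows on the key field.
joined = conn.execute(
    "SELECT counties.name, soils.soil_type "
    "FROM counties JOIN soils ON counties.county_id = soils.county_id "
    "ORDER BY counties.name, soils.soil_type"
).fetchall()
print(joined)
```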
Hybrid
- coordinate data in one database type, attribute data in another; integration of topology
Normal forms of RDBMS
break large tables into small tables that contain simple functional dependencies
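The idea of breaking a large table into small ones can be sketched in plain Python. The parcel and county data below are invented for illustration, and the example shows only one decomposition step, not a full treatment of the normal forms.

```python
# Hypothetical large table: the county seat is repeated on every parcel row.
parcels_wide = [
    {"parcel_id": 1, "owner": "Smith", "county": "Adams", "county_seat": "Northtown"},
    {"parcel_id": 2, "owner": "Jones", "county": "Adams", "county_seat": "Northtown"},
    {"parcel_id": 3, "owner": "Lee", "county": "Baker", "county_seat": "Southville"},
]

# county_seat depends only on county, so it moves to its own small table.
counties = {row["county"]: row["county_seat"] for row in parcels_wide}

# The parcel table keeps only attributes that depend on parcel_id,
# plus the county key that links back to the counties table.
parcels = [
    {"parcel_id": row["parcel_id"], "owner": row["owner"], "county": row["county"]}
    for row in parcels_wide
]

# Joining on the key reproduces the original rows, so no information is lost.
rebuilt = [dict(row, county_seat=counties[row["county"]]) for row in parcels]
print(counties)
print(rebuilt == parcels_wide)
```

Each county seat is now stored once, so a change to it has to be made in only one place.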
5. Questions for Database Concepts
- Describe the differences between vector and raster data structures.
- Describe topology and why it is important to GIS capabilities.
- Explain how a database is created.
- Describe the information normally provided by the user of the database.
- Explain the differences among flat file, hierarchical/network, and relational data models.
Submitted by Dave Gay on 18 April 98. Updated by Andrew Young on .....