GIS Raster/Vector Data Structures, Database Management Systems, Database Organization
K Brooks
All spatial features are recorded as geographic primitives, each with the following primary characteristics:
Points (0-D, no length or width);
Lines (1-D, length, no width);
Polygons/areas (2-D, length and width / area and perimeter);
Surfaces (3-D, areas with a Z dimension);
These primitives have two major types of spatial data representation: raster and vector.
Raster/Grid/Image
Raster: matrix of cells (pixels) referenced by row/column, stored as a matrix or array;
(Example from Getting Started with ArcGIS, ESRI 2002)
Raster: For geo-referenced rasters, every cell represents a given area on the ground (resolution). The smaller the ground area each cell represents, the larger the data set needed to cover a given extent.
Raster: Cell values represent nominal, ordinal, or continuous data. Numbers in cells can be integer or floating point.
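The points above can be sketched in a few lines of Python. This is only an illustration: the cell values, the origin, the cell size, and the helper names `cell_center` and `cell_count` are assumptions made for the example, not part of any GIS package.

```python
import math

# A small raster: rows of cells, each cell holding one value.
# The values are made-up nominal codes (e.g. land-cover classes).
grid = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [1, 3, 3, 2],
]

# Assumed geo-referencing: upper-left corner and cell size in map units.
X_MIN, Y_MAX = 500000.0, 4200000.0
CELL_SIZE = 30.0  # each cell covers 30 x 30 map units on the ground


def cell_center(row, col, cell_size=CELL_SIZE):
    """Return the map x,y of the center of the cell at (row, col)."""
    x = X_MIN + (col + 0.5) * cell_size
    y = Y_MAX - (row + 0.5) * cell_size  # rows count downward from the top
    return x, y


def cell_count(width, height, cell_size):
    """Number of cells needed to cover a width x height ground area."""
    return math.ceil(width / cell_size) * math.ceil(height / cell_size)


print(grid[1][2])           # value stored at row 1, column 2
print(cell_center(1, 2))    # ground position of that cell's center
print(cell_count(3000, 3000, 30), cell_count(3000, 3000, 10))
```

In this sketch, cutting the cell size from 30 to 10 units multiplies the number of cells covering the same ground area by nine.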
Raster: The attributes are the data set: the cell values themselves are the data.

(Examples from Getting Started with ArcGIS, ESRI 2002)
Raster: To the computer, this:

Consists of this:

Raster: In the ArcGIS grid data model, data tables can store additional information about nominal/categorical data in the Value Attribute Table (VAT). VATs store information about the categories, not about individual cells:

(Example from Getting Started with ArcGIS, ESRI 2002)
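A rough sketch of how a table like the VAT can be derived from a categorical raster: one record per category value, with a count of the cells in that category. The grid values, the `labels` dictionary, and the field names are illustrative assumptions.

```python
from collections import Counter

# Nominal raster: each integer is a category code (made-up values).
grid = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [1, 3, 3, 2],
]

# Hypothetical descriptions for each category code.
labels = {1: "forest", 2: "water", 3: "urban"}


def build_vat(raster, labels):
    """Build one table record per category: value, cell count, description."""
    counts = Counter(value for row in raster for value in row)
    return [
        {"VALUE": value, "COUNT": counts[value], "LABEL": labels.get(value, "")}
        for value in sorted(counts)
    ]


vat = build_vat(grid, labels)
for record in vat:
    print(record)
```

The table has three records for twelve cells, which is the point made above: the table describes categories, not individual cells.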
Raster Coding/Representation of Geographic Primitives
Points: single cells, unique/known values;
Lines: strings of cells with common values;
Polygons/areas: groups of cells with common values;
Surfaces: cells represent real or virtual elevations;
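As a small illustration of this coding, the sketch below writes a point, a line, and a polygon into one grid. The grid size and the codes 9, 5, and 7 are arbitrary choices for the example.

```python
ROWS, COLS = 6, 8
NODATA = 0

# Start with an empty grid, then write each primitive in with its own code.
grid = [[NODATA] * COLS for _ in range(ROWS)]

# Point: a single cell with a known value.
grid[0][7] = 9

# Line: a string of cells sharing a value (here a horizontal run).
for col in range(0, 6):
    grid[5][col] = 5

# Polygon/area: a group of cells sharing a value (here a 3 x 3 block).
for row in range(1, 4):
    for col in range(2, 5):
        grid[row][col] = 7


def cells_with(raster, value):
    """Return the (row, col) positions of every cell holding `value`."""
    return [(r, c) for r, row in enumerate(raster)
            for c, v in enumerate(row) if v == value]


for row in grid:
    print(" ".join(str(v) for v in row))
```

A surface would use the same grid, with each cell holding an elevation instead of a category code.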
Vector
Vector: discrete Cartesian x,y coordinates;

(Example from Getting Started with ArcGIS, ESRI 2002)
Vector: sizes of lines or areas vary, as they trace surface phenomena.
Vector: data stored as pairs of x,y coordinates, usually with ID numbers; data typically stored in separate data tables.
Vector: In ArcGis, except in polygon coverages, the data tables contain exactly as many records as there are unique features in the data set.

(Examples from Getting Started with ArcGIS, ESRI 2002)
Vector Coding/representation of Geographic Primitives
Points: id, x, y;
Lines: id, x1,y1 ... xn,yn;
Polygons: id, x1,y1 ... xn,yn, where xn=x1, yn=y1 (closed);
Surfaces: represented by Triangulated Irregular Networks (TINs):
(Example from Getting Started with ArcGIS, ESRI 2002)
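The point, line, and polygon codings listed above can be written out as a minimal Python sketch. The dictionary layout, the coordinates, and the helper names `length` and `is_closed` are assumptions for illustration only.

```python
import math

# Each feature is an ID plus its coordinates (values are made up).
point = {"id": 1, "xy": (2.0, 3.0)}
line = {"id": 2, "xy": [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]}
polygon = {"id": 3, "xy": [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0),
                           (0.0, 3.0), (0.0, 0.0)]}


def length(coords):
    """Sum of straight segment lengths along a string of coordinates."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def is_closed(coords):
    """A polygon ring is closed when its last vertex repeats its first."""
    return len(coords) >= 4 and coords[0] == coords[-1]


print(length(line["xy"]))        # line length
print(is_closed(polygon["xy"]))  # the polygon ring closes on itself
print(length(polygon["xy"]))     # polygon perimeter
```

The same `length` function gives a line's length and a polygon's perimeter, because a polygon is stored as a line that returns to its starting coordinate.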
Feature Data Formats and Characteristics in ArcGIS
In ArcGIS, spatial and attribute data may be stored in several formats. These formats have evolved in step with ESRI's GIS products. [Other data formats exist; these are discussed in a later topic.]
Coverage Data Model
Traditional/original Arc/Info data model.
Primary features are points, lines and polygons.
Special point types include: label points and nodes.
Composite features: Linear composites are routes/sections and polygonal composites are regions.
Secondary features: annotation, tics, and links.
Coverages are file based, consisting of multiple files housed in a system folder (subdirectory). The folder name is the coverage name.
Coverages exist in Arc/Info workspaces (subdirectories) which are characterized by the presence of an INFO folder. The examples below show system and ArcCatalog views of a workspace.
(Examples from Getting Started with ArcGIS, ESRI 2002)
Handling Coverages: NEVER use standard Windows tools to copy a coverage; use ArcCatalog (or other ArcGIS tools). [If you are careful, you can copy an entire workspace with Windows commands.]
Shapefile Data Model
The shapefile format was originally associated with the ArcView GIS software.
Primary features are points, lines and polygons. These may be simple or multi-part.
Handling Shapefiles: A feature dataset in shapefile format consists of 3 or more files with the same name and different extensions. (See examples below showing system and ArcCatalog views). If system commands are used to copy a shapefile, be sure to include all the files. ArcCatalog is the safest means to copy shapefiles.

(Examples from Getting Started with ArcGis, ESRI 2002)
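If system-level copying cannot be avoided, the "include all the files" rule might be sketched as below. The function names are hypothetical, the demonstration uses empty placeholder files rather than real shapefile contents, and ArcCatalog remains the safer route.

```python
import shutil
import tempfile
from pathlib import Path


def shapefile_parts(shp_path):
    """Return every file in the folder that shares the shapefile's base name."""
    shp_path = Path(shp_path)
    return sorted(p for p in shp_path.parent.iterdir()
                  if p.is_file() and p.stem == shp_path.stem)


def copy_shapefile(shp_path, dest_dir):
    """Copy all component files of a shapefile into dest_dir."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(part, dest_dir))
            for part in shapefile_parts(shp_path)]


# Demonstration with empty placeholder files (not real shapefile contents).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    for ext in (".shp", ".shx", ".dbf", ".prj"):
        (src / f"roads{ext}").touch()
    (src / "rivers.shp").touch()  # a different data set; must not be copied

    copied = copy_shapefile(src / "roads.shp", Path(tmp) / "dest")
    copied_names = sorted(p.name for p in copied)
    print(copied_names)
```

One limitation of this sketch: matching on the base name alone misses companion files with a double extension (for example a name ending in `.shp.xml`), so a real copy routine would need a broader match.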
Geodatabase Data Model
Geodatabases are the latest ESRI data format. They implement an "object-oriented" GIS data model.
In the geodatabase, each vector feature receives a row in a data table; the vector shape is stored in the shape field, and attributes are in other fields. Each data table stores a feature class.
Geodatabases store multiple features, rasters, other data tables and references to yet others.
A single geodatabase file can store multiple features and objects. Geodatabase files are Database Management System files.
Primary features are points, lines and polygons, simple and multipart. Special features (and relationships between features) can be designed.
Special features: Point domain: points with special references or behaviors, simple or complex network junctions. Lines can be traditional x, y coordinate lines, or computed lines such as Bezier curves. Lines can also be simple or complex network "edges."
Polygons consist of the line types noted above, and may also implement complex relationships or behaviors. Multiple representations can now be arranged thematically (e.g., points, lines and polygons describing local hydrological features). Essentially the item to be mapped can be paramount, as opposed to the GIS data model.
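The idea of "one row per feature, with the shape in its own field" can be illustrated with any DBMS. The sketch below uses an in-memory SQLite database as a stand-in; it is not the actual geodatabase storage format, and the table name, field names, and stream records are invented for the example.

```python
import sqlite3

# In-memory SQLite database standing in for a DBMS file (an assumption
# for illustration; this is not how a geodatabase is really stored).
db = sqlite3.connect(":memory:")

# One table per feature class: a shape field plus attribute fields.
db.execute("""
    CREATE TABLE streams (
        objectid INTEGER PRIMARY KEY,
        shape    TEXT NOT NULL,   -- geometry, written here as plain text
        name     TEXT,
        order_no INTEGER
    )
""")
db.executemany(
    "INSERT INTO streams (shape, name, order_no) VALUES (?, ?, ?)",
    [
        ("LINESTRING (0 0, 5 5, 9 6)", "Mill Creek", 1),
        ("LINESTRING (9 6, 14 6)", "Bear River", 2),
    ],
)

# An attribute query returns the matching rows, shapes included.
rows = db.execute(
    "SELECT objectid, name, shape FROM streams WHERE order_no >= 2"
).fetchall()
print(rows)
```

Because shape and attributes sit in the same row, selecting features by attribute and retrieving their geometry is a single query.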
Handling geodatabases: These can be single DBMS files. The system shows the DBMS file name, while ArcCatalog shows the contents of the file:

(Examples from Getting Started with ArcGis, ESRI 2002)
Topology
Vector Data can explicitly represent spatial relationships through Topological Coding.
Coverage Topology in Arc/Info: In the coverage data model, 'Planar Topology' is strictly enforced. Its characteristics:
".. allows you to define relationships between objects, together with rules for maintaining the referential integrity between objects" (Arc GIS Help).
- Strings of points form lines;
- Lines (arcs) must begin and end with Nodes;
- Nodes must exist where any two lines cross;
- Nodes are numbered and coded as from-nodes and to-nodes;
- Arcs join at nodes: for networks, from-to coding allows us to describe connectivity;
- Arcs are numbered, and joined at nodes to create polygons;
- From-to coding of the arcs allows coding of spatial relations between polygons (adjacency, enclosure).
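The arc-node coding described in the list above can be sketched with a small table of arcs. The example assumes two adjacent polygons, A (west) and B (east), sharing one arc; the left/right polygon fields, the "outside" label, and the function names are illustrative assumptions, not the coverage file layout.

```python
from collections import defaultdict

# Each arc records its from-node, its to-node, and the polygon on its left
# and right when travelling from-node to to-node.
# Arc 1 is the shared edge; arcs 2 and 3 are the outer boundaries of A and B.
arcs = {
    1: {"from": 1, "to": 2, "left": "B", "right": "A"},
    2: {"from": 2, "to": 1, "left": "outside", "right": "A"},
    3: {"from": 1, "to": 2, "left": "outside", "right": "B"},
}


def arcs_at_nodes(arcs):
    """Connectivity: which arcs meet at each node."""
    table = defaultdict(list)
    for arc_id, arc in arcs.items():
        table[arc["from"]].append(arc_id)
        table[arc["to"]].append(arc_id)
    return dict(table)


def polygon_arcs(arcs, polygon):
    """The arcs that together form the boundary of one polygon."""
    return [arc_id for arc_id, arc in arcs.items()
            if polygon in (arc["left"], arc["right"])]


def adjacency(arcs):
    """Polygons on opposite sides of the same arc are adjacent."""
    return {frozenset((arc["left"], arc["right"]))
            for arc in arcs.values() if arc["left"] != arc["right"]}


print(arcs_at_nodes(arcs))
print(polygon_arcs(arcs, "A"))
print(frozenset(("A", "B")) in adjacency(arcs))
```

No coordinates are needed to answer these questions; the relationships are read directly from the arc table, which is what makes the topology explicit.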
Shapefile Topology in ArcView and ArcGIS 8+:
- ArcView can represent topology by ordering the vertices of rings in a known (clockwise) order (Theobald). This topology is computed as needed; it is not a permanent aspect of the data, as it is in coverages;
- In shapefiles planar topology is not enforced: watch for gaps, overlaps and so on, and use snapping, vertex editing and similar tools to eliminate those you do not intend.
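A known vertex order is useful because it tells software which side of each edge is inside the polygon. One common way to test the order of a ring is the sign of its shoelace area, sketched below. This assumes x increases eastward and y northward, and it is not taken from ArcView itself.

```python
def signed_area(ring):
    """Shoelace area of a closed ring.

    Positive for counter-clockwise rings and negative for clockwise rings,
    assuming x increases to the east and y to the north.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(ring):
    """True when the ring's vertices run in clockwise order."""
    return signed_area(ring) < 0


# The same 4 x 3 rectangle, digitized in both directions.
ccw = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
cw = list(reversed(ccw))

print(signed_area(ccw), signed_area(cw))
print(is_clockwise(cw), is_clockwise(ccw))
```

With a clockwise ring, the polygon interior always lies to the right of the direction of travel along the boundary.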
Geodatabase Topology in ArcGIS 8+:
- Arc 8's new GeoDataBase model:
(Example from Getting Started with ArcGis, ESRI 2002)
Advantages/Disadvantages/Appropriate Use of Data Types
(PA Burrough 1986)
Vector:
+ Good representation of phenomenological data;
+ Compact;
+ Topology can be completely described;
+ 'Accurate' graphics;
+ Retrieval, update & generation of graphics possible;
- Complex structure;
- Combination of overlays creates difficulties;
- Simulation difficult (non-uniform sizes);
o Display & plotting can be $expensive;
o Technology relatively expensive;
o Spatial analysis/filtering not possible.
Raster
+ Good Representation of Continuous Phenomena (e.g., elevations, reflectance);
+ Simple structure;
+ Overlay, & combining with RS data is easy;
+ Various spatial analyses easy;
+ Simulation easy (uniform size);
+ Technology is inexpensive, dynamic development is easy;
- Data sets can be quite large;
- Use of large cells introduces imprecision, mixed (partial) cells;
- Crude raster maps not aesthetically pleasing;
- Network (topological) linkages difficult to establish;
- Spatial relationships implicit rather than explicit;
- Some operations CPU intensive (e.g., projection, re-sampling).
Comparison table of two data structures and their applications

| Vector (inventory) | Raster (analysis) |
| --- | --- |
| Lines real | Lines artificial (pixels) |
| Data known (pre-interpreted) | Data Probabilistic |
| Descriptive Inquiries | Prescriptive Analysis |
| Computer Mapping | Spatial statistics |
| Spatial DBMS | Modeling |

Adapted from J. Berry, 1993, Beyond Mapping, Table 2.1, page 11.
Database Management Systems
Computer-based systems for creation, manipulation, management and update of data;
Benefits & Liabilities of Data Base Management Approach
+ Central Control;
+ Data independence;
+ Easier Implementation of New Applications;
+ Direct User Access;
+ Control Redundancy;
- $$$ (Expense);
- Complexity;
- Centralized Risk.
Modern GIS links geography and attributes.
Basic Database Components
Types of Database Management Systems
Flat File: simplest, most common system type found in PC environment;
Hierarchical: structured as a hierarchy; need to navigate up and down through the hierarchy;
Network: Hierarchical with links across levels;
Relational: most common DBMS type used in GIS; INFO, Oracle, and Sybase are examples;
Relational: normalization reduces redundancy and facilitates updates; successful implementation depends upon key codes, where each object (row) contains at least one unique identifier with a one-to-one relation to the geography.
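A minimal sketch of normalization and key codes, using an in-memory SQLite database. The parcel and zone tables, their field names, and their values are invented for illustration; `parcel_id` plays the role of the unique identifier tied one-to-one to a map feature.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Zone descriptions are stored once, not repeated on every parcel.
    CREATE TABLE zones (
        zone_code   TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    -- One row per map feature; parcel_id is the unique key that also
    -- identifies the feature's geometry in the GIS layer.
    CREATE TABLE parcels (
        parcel_id INTEGER PRIMARY KEY,
        zone_code TEXT NOT NULL REFERENCES zones(zone_code)
    );
    INSERT INTO zones VALUES ('R1', 'Residential'), ('C2', 'Commercial');
    INSERT INTO parcels VALUES (101, 'R1'), (102, 'R1'), (103, 'C2');
""")

# Updating a description touches one row, yet every parcel sees the change.
db.execute("UPDATE zones SET description = 'Single-family residential' "
           "WHERE zone_code = 'R1'")

rows = db.execute("""
    SELECT p.parcel_id, z.description
    FROM parcels AS p JOIN zones AS z ON p.zone_code = z.zone_code
    ORDER BY p.parcel_id
""").fetchall()
print(rows)
```

Had the description been typed into every parcel row, the same update would have needed to find and change each copy.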
GIS Database Organization
Map/Coverage Organization
Arc/Info coverages are physically directories on the computer;
Each directory has the name of the coverage; In the directory are multiple files which together constitute the coverage;
Arc/Info coverages must reside in an Arc/Info workspace, characterized by presence of an INFO subdirectory;
ArcView can access this data structure; ArcView's own data sets (shapefiles) consist of a main file (name.shp), an index file (name.shx), and a dBASE attribute file (name.dbf);
Other GIS systems adopt similar strategies: most use more than one file to contain a coverage.
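The directory conventions described above suggest a simple check, sketched here with a mock-up folder tree. Treating every non-INFO subdirectory as a coverage is a simplifying assumption, and the function names are hypothetical.

```python
import tempfile
from pathlib import Path


def is_workspace(folder):
    """True if the folder contains an INFO subdirectory (any letter case)."""
    folder = Path(folder)
    return folder.is_dir() and any(
        p.is_dir() and p.name.lower() == "info" for p in folder.iterdir()
    )


def list_coverages(workspace):
    """Names of the workspace's other subdirectories, taken as coverages."""
    if not is_workspace(workspace):
        raise ValueError(f"{workspace} has no INFO folder")
    return sorted(p.name for p in Path(workspace).iterdir()
                  if p.is_dir() and p.name.lower() != "info")


# Demonstration with an empty mock-up of a workspace (no real coverage files).
with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp) / "project"
    for name in ("info", "roads", "soils"):
        (ws / name).mkdir(parents=True)
    (ws / "notes.txt").touch()

    found = list_coverages(ws)
    plain_folder_is_ws = is_workspace(ws / "roads")
    print(found, plain_folder_is_ws)
```

The sketch only reads folder names; it does not open or validate the files inside a coverage.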
Data Base / Study Area/ Multi-Coverage Organization
Depending upon the scale and extent of a study area, GIS data bases may be organized into either a single, seamless coverage, or into multiple, spatially integrated coverages;
Multiple maps for a single study area are sometimes termed tiles;
Digital Map libraries may consist of multiple tiles, each containing multiple thematic layers;
Higher-end software (Arc/Info) has optional software to assist in management of map libraries (Arc Librarian, Arc Storage Manager (STORM), Arc SDE (Spatial Database Engine)), allowing high-end DBMS tools to be employed.
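A tiled library can be pictured as a lookup from tile index to the layers stored for that tile. The sketch below assumes square tiles on a regular grid; the origin, tile size, layer names, and function names are all invented for illustration and do not describe how any of the products named above work.

```python
import math

# Assumed tiling scheme: square tiles on a regular grid anchored at an origin.
ORIGIN_X, ORIGIN_Y = 0.0, 0.0
TILE_SIZE = 10000.0  # map units per tile side


def tile_for(x, y):
    """Return the (row, col) index of the tile containing a coordinate."""
    col = math.floor((x - ORIGIN_X) / TILE_SIZE)
    row = math.floor((y - ORIGIN_Y) / TILE_SIZE)
    return row, col


def tiles_for_extent(xmin, ymin, xmax, ymax):
    """Return every tile index touched by a rectangular study area."""
    r0, c0 = tile_for(xmin, ymin)
    r1, c1 = tile_for(xmax, ymax)
    return [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


# Hypothetical library: each tile holds the same set of thematic layers.
library = {tile: ["roads", "soils", "hydrology"]
           for tile in tiles_for_extent(0, 0, 19999, 19999)}

needed = tiles_for_extent(8000, 8000, 12000, 12000)
print(tile_for(12500, 3000))
print(needed)
print(all(tile in library for tile in needed))
```

A study area that straddles tile boundaries, as in the example extent, draws on several tiles for each thematic layer.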