Vector vs. Raster Data Models

Within geographic information systems (GIS), there are two main ways to model features across space: vectors and rasters. Both types of data models have unique advantages and disadvantages, and come with assumptions and limitations. Geographic research often involves analyzing data represented as one or both types of data models.

Vector Data Model

Imagine a stick figure for a moment. Stick figures are essentially vectors: their head, legs, body, and arms are made of points, lines, and polygons, the three ingredients of a vector data model. Ultimately, all vectors are built from points, the basic building block of lines and polygons. Points represent an exact location but do not themselves take up space. Connecting two points creates a line segment, and connecting several in sequence creates a line. Technically, a line contains an infinite number of individual points, so any location along it can also be associated with a single point, or exact location (typically on the Earth’s surface). Finally, line segments can be joined end to end into closed shapes with three or more sides, known as polygons.

Vectors are composed of points, lines, and polygons, as in this map of parcels in Omaha, NE.
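To make the three building blocks concrete, here is a minimal sketch in plain Python. It assumes flat (x, y) coordinates in arbitrary units, and the feature names (hydrant, road, parcel) and helper functions are made up for illustration; a real GIS stores geometries in its own formats and accounts for map projections.

```python
import math

# A point is a single (x, y) coordinate pair: a location with no area.
hydrant = (2.0, 3.0)

# A line is an ordered sequence of points joined by segments.
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

# A polygon is a closed ring of three or more points
# (the last point connects back to the first).
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def line_length(line):
    """Sum the straight-line lengths of consecutive segments."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))


def polygon_area(ring):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(line_length(road))
print(polygon_area(parcel))
```

Notice that the line and the polygon are both just lists of points; what differs is how the points are connected and interpreted.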

Most geographic features can be represented as some combination of points, lines, and polygons. Which of the three is best used to represent a given feature, however, often depends on scale. For example, a relatively close-up (large scale) view of a city often requires a polygon to show the extent of the municipal boundaries or built-up area. If the feature takes up much of the territory or map being viewed, then a polygon is probably most appropriate. If you zoom out to the national level (small scale), however, it makes sense to represent cities as discrete points since they take up little space at such a broad scale.

Linear features like streams and roads are most often represented as line segments since they are much longer than they are wide. However, just like with cities, if you zoom in far enough you’ll see that all roads and streams have some width. In a vector data model, these linear features may be represented by a line at small scales and polygons at large scales (i.e., zoomed in), or they might be represented by lines at all scales. Often it depends on how much detail is actually needed to do a particular analysis or convey certain information.

Vectors are frequently used in all kinds of applications. One common arena is urban planning, where parcels and buildings are often represented as polygons, roads as lines or polygons, and small features like fire hydrants and telephone poles are represented by points. The U.S. Census Bureau also relies heavily on the vector data model, with census units like block groups and census tracts represented by polygons. All census data is then tied to these units for analysis.

Raster Data Model

Imagine zooming in really close to a digital photograph. Eventually you’ll see that the entire photograph is actually composed of thousands (if not millions!) of small boxes known as pixels. A one-megapixel image, for example, contains about one million pixels. Digital images are rasters: information is encoded in a continuous layer of grid cells arranged in a matrix across a surface. Every grid cell in a raster data layer is associated with one or more attributes or values. This can be a categorical value such as grass, trees, or water, or a numerical value such as inches of rainfall. Importantly, every cell in a raster has a value, even if that value is zero or “no data”.

Raster data is composed of grid cells, each with an associated value.
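A raster can be sketched as a simple grid of values plus enough information to tie the grid to the ground. The example below is only an illustration: the grid contents, the 30-unit cell size, the top-left origin, and the function names are all assumptions, and "no data" is represented here with Python's None.

```python
import math

NODATA = None  # assumed marker for cells with no measurement

# A categorical raster: every cell holds a land cover class.
land_cover = [
    ["water", "grass", "grass"],
    ["water", "trees", "grass"],
    ["trees", "trees", "grass"],
]

# A numerical raster on the same grid: inches of rainfall per cell.
rainfall_in = [
    [0.0, 1.5, 2.0],
    [0.5, NODATA, 2.5],
    [1.0, 1.0, 3.0],
]

# Georeferencing (assumed): x of the left edge, y of the top edge,
# and the width/height of each square cell.
X_MIN, Y_MAX, CELL_SIZE = 0.0, 90.0, 30.0


def cell_index(x, y):
    """Convert a map coordinate to a (row, col) position in the grid."""
    col = math.floor((x - X_MIN) / CELL_SIZE)
    row = math.floor((Y_MAX - y) / CELL_SIZE)
    return row, col


def value_at(raster, x, y):
    """Look up the value of the cell that contains the coordinate."""
    row, col = cell_index(x, y)
    if not (0 <= row < len(raster) and 0 <= col < len(raster[0])):
        raise ValueError("coordinate falls outside the raster")
    return raster[row][col]


print(value_at(land_cover, 45.0, 75.0))
print(value_at(rainfall_in, 45.0, 45.0))
```

Every coordinate inside the grid maps to exactly one cell, which is why every location in a raster has some value, even if that value is "no data".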

While vectors are quite good at representing individual features such as buildings or streams, rasters are ideal for modeling variables that vary continuously over the Earth’s surface, such as land cover, rainfall, elevation, and concentrations of air pollution. Rasters can also be very useful in calculating landscape change and estimating values across a surface. If, for example, you would like to know how much annual precipitation has changed over a certain period of time and across a given landscape, you could simply subtract one precipitation value from another for each grid cell to get a change value. Additionally, if you want to estimate precipitation across a broad area but only have actual measured values at a few point locations, you can create a surface of estimated values in the form of a raster. This is a common type of spatial analysis that can be accomplished using different methods within a GIS.
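Both operations described above can be sketched in a few lines of plain Python. The precipitation values and function names are invented for illustration. For the estimated surface, the sketch uses inverse distance weighting (IDW), which is only one of several interpolation methods a GIS may offer; it is chosen here because it is short, not because the text prescribes it.

```python
import math


def raster_difference(later, earlier):
    """Cell-by-cell change between two rasters on the same grid.

    None is the assumed "no data" marker; any cell missing in either
    input is missing in the output.
    """
    return [
        [None if a is None or b is None else a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(later, earlier)
    ]


def idw(x, y, samples, power=2):
    """Estimate a value at (x, y) from (x, y, value) sample points,
    weighting each sample by 1 / distance**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # exactly on a sample point
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


def idw_surface(samples, rows, cols, cell_size):
    """Build a raster by evaluating IDW at the center of every cell.
    Row 0 is the top of the grid; the lower-left corner is (0, 0)."""
    return [
        [idw((c + 0.5) * cell_size, (rows - r - 0.5) * cell_size, samples)
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical annual precipitation (inches) for two years.
precip_2000 = [[30, 32], [28, None]]
precip_2020 = [[33, 31], [28, 40]]
print(raster_difference(precip_2020, precip_2000))

# Hypothetical gauge readings: (x, y, inches).
gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(1.0, 0.0, gauges))
surface = idw_surface(gauges, rows=2, cols=2, cell_size=1.0)
```

The subtraction only makes sense when both rasters share the same grid (same extent, cell size, and alignment), which is why the function pairs cells by position.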

Raster datasets come from a variety of sources, but one of the most common forms is remotely-sensed imagery. Remote sensing involves collecting data at a distance. In geography this often involves photographing or otherwise sensing the Earth’s surface using elevated platforms such as planes, helicopters, and satellites. Remotely sensed images can be imported into a GIS where they can then be viewed and analyzed using a variety of techniques. One common procedure is re-classifying remotely-sensed imagery in order to emphasize certain features in the landscape or assess the size of, or change in, a certain attribute.

A 30×30 meter resolution raster dataset of land cover data in the Chicago region. Note that urban land cover is red. Source: USGS.
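Re-classification is easy to picture on a small grid. The sketch below collapses several land cover classes into "urban" and "other", then uses the 30×30 meter cell size to turn a cell count into an area. The class labels and the tiny grid are made up for illustration and are not the actual codes used in the USGS dataset.

```python
CELL_SIZE_M = 30  # each cell covers 30 m x 30 m on the ground

# Hypothetical classified image (labels are illustrative only).
land_cover = [
    ["water", "forest", "developed_low"],
    ["forest", "developed_high", "developed_high"],
    ["forest", "forest", "developed_low"],
]

# Re-classification table: old class -> new class.
to_urban = {
    "developed_low": "urban",
    "developed_high": "urban",
}


def reclassify(raster, mapping, default="other"):
    """Replace every cell value using the mapping; unmapped values
    fall back to the default class."""
    return [[mapping.get(value, default) for value in row] for row in raster]


def class_area_m2(raster, target, cell_size_m):
    """Area covered by one class: cell count times area per cell."""
    count = sum(row.count(target) for row in raster)
    return count * cell_size_m ** 2


urban_map = reclassify(land_cover, to_urban)
print(urban_map)
print(class_area_m2(urban_map, "urban", CELL_SIZE_M))
```

Running the same steps on images from two different dates and comparing the areas is one simple way to assess change in an attribute such as urban extent.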

Raster or Vector?

Deciding whether to use a vector or raster data model in your work depends entirely on the data you have at hand and what your goals are for displaying and/or analyzing the data. There are many analyses that make use of both data models or require the conversion of one to another. While conversion is a common procedure, it’s recommended that any translation between raster and vector be kept to a minimum to avoid accumulating error in your spatial model.

Land cover represented as a raster (left) and as a vector (right). Source: UConn library.
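To see why conversion can introduce error, consider a rough sketch of vector-to-raster conversion. It marks a cell as covered when the cell's center falls inside the polygon, tested with a standard ray-casting check. This is one simple rule chosen for illustration; GIS software may use different rules (for example, majority of cell area), and the triangle and grid sizes here are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count how many polygon edges a ray going right
    from (x, y) crosses; an odd count means the point is inside."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, rows, cols, cell_size):
    """Convert a polygon to a grid of 1s and 0s by testing each cell
    center. The grid's lower-left corner is (0, 0); row 0 is the top."""
    grid = []
    for r in range(rows):
        cy = (rows - r - 0.5) * cell_size
        grid.append([
            1 if point_in_polygon((c + 0.5) * cell_size, cy, ring) else 0
            for c in range(cols)
        ])
    return grid


def raster_area(grid, cell_size):
    """Area represented by the cells marked 1."""
    return sum(map(sum, grid)) * cell_size ** 2


# A right triangle whose true area is 0.5 * 4 * 3 = 6 square units.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]

coarse = rasterize(triangle, rows=2, cols=2, cell_size=2.0)
fine = rasterize(triangle, rows=4, cols=4, cell_size=1.0)
print(raster_area(coarse, 2.0), raster_area(fine, 1.0))
```

The coarse grid and the fine grid report different areas for the same triangle, and in both cases the smooth diagonal edge becomes a staircase of cells. Converting that staircase back to a vector would not recover the original shape, which is how repeated conversions accumulate error.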

The size of the dataset may also be a consideration, as raster datasets can be quite large and difficult for some computers to process in a timely fashion. Often it is recommended to use vector data unless modeling a continuous surface. Furthermore, when using a raster data model it is important to use cell sizes that are appropriate for your analysis (i.e., an appropriate resolution).
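A quick back-of-the-envelope calculation shows how strongly cell size drives raster size. The study area dimensions and the 4 bytes per cell below are assumptions for illustration, and the estimate ignores compression and file overhead.

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover a rectangular area."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols


def approx_size_mb(width_m, height_m, cell_size_m, bytes_per_cell=4):
    """Rough uncompressed size of a single-band raster in megabytes."""
    return cell_count(width_m, height_m, cell_size_m) * bytes_per_cell / 1_000_000


# A hypothetical 30 km x 30 km study area at three resolutions.
for size in (30, 15, 10):
    print(size, cell_count(30_000, 30_000, size), approx_size_mb(30_000, 30_000, size))
```

Halving the cell size quadruples the number of cells, so a finer resolution than the analysis needs can make a dataset much larger without adding useful information.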
