April 24, 2018
April 26, 2018


Entering data into a GIS so that it can be analyzed and edited is the most difficult and time-consuming part of working with GIS software. The amount of data needed to create either a base map or an information layer placed on top of the base map is usually enormous. As technology has evolved, however, a new “science”, or more precisely a new method, has developed. This method is called data capturing, and it refers to entering data into a system automatically rather than by direct manual input, for example through an activity or a software package other than the GIS itself. Data capturing could indeed be considered a science in its own right, as it covers many kinds of original sources and document formats that need to be entered into a GIS and digitized.

More specifically, with regard to GIS, data capturing is a technology used to enter digital information from various sources. Even if these sources are in analog format, data capturing can convert them into digital format as they enter the GIS. It is very important for the user to know what the project is about in order to choose the most appropriate method of data capturing. These methods include conventional scanning, which brings a map into the GIS as a raster file; survey data entered with a technique called COGO, which is applied in survey devices for digital data collection; satellite imagery; LiDAR; and others. Of course, all of the above sources deliver data in digital format, and we should not forget that there are also sources such as letters, emails and conventional survey instruments that demand human intervention. This human intervention includes the traditional way of manually digitizing data by georeferencing a scanned image and assigning coordinates to the four corners of the picture.
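The georeferencing step can be illustrated with a minimal Python sketch. It assumes that the map coordinates of the four image corners are known and interpolates bilinearly between them; the function name, the corner values and the image size are hypothetical, and real GIS packages usually fit a transformation to several control points instead.

```python
def make_georeferencer(width, height, top_left, top_right, bottom_right, bottom_left):
    """Return a function that maps pixel (col, row) to map (x, y).

    Pixel coordinates are measured from the top-left corner of the image;
    (width, height) is the bottom-right corner. Each corner argument is an
    (x, y) pair in map units (illustrative values, not from a real map).
    """
    def pixel_to_map(col, row):
        u = col / width    # 0 at the left edge, 1 at the right edge
        v = row / height   # 0 at the top edge, 1 at the bottom edge
        # Bilinear weights of the four corners
        w_tl = (1 - u) * (1 - v)
        w_tr = u * (1 - v)
        w_br = u * v
        w_bl = (1 - u) * v
        x = (w_tl * top_left[0] + w_tr * top_right[0]
             + w_br * bottom_right[0] + w_bl * bottom_left[0])
        y = (w_tl * top_left[1] + w_tr * top_right[1]
             + w_br * bottom_right[1] + w_bl * bottom_left[1])
        return x, y

    return pixel_to_map


# Hypothetical 2000 x 2000 pixel scan covering a 1 km x 1 km square
to_map = make_georeferencer(
    2000, 2000,
    top_left=(500000.0, 4100000.0),
    top_right=(501000.0, 4100000.0),
    bottom_right=(501000.0, 4099000.0),
    bottom_left=(500000.0, 4099000.0),
)
center = to_map(1000, 1000)  # map coordinates of the image center
```

Once such a mapping exists, every point digitized on the scanned image can be stored in map coordinates rather than pixel coordinates.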


The process of data capturing can be divided into categories depending on the source of the data, as shown below.

  1. Primary data sources:

     Data in digital format collected for direct use in GIS software. They are divided into two categories:

  • Raster data capturing: Data originating from remote sensing, with information about physical, chemical and biological objects on earth. The data have a spectral, spatial and temporal resolution, which means that the final image has specific characteristics and that an updated image is produced on each repeat cycle of the satellite.

  • Vector data capturing: Consists of two branches, ground surveying and GPS.

    • Ground surveying obtains highly accurate data for buildings, boundaries and other features, but it is a time-consuming and expensive method.

    • GPS provides line and point location data. The data come from satellites and, more recently, from LiDAR, a newer technology in which an aircraft-mounted laser collects data for topographic projects. GPS also makes it possible to enter attribute data at the same time as the positions are captured.

  2. Secondary data sources:

     Data captured for other purposes, in digital or analog format, and converted into a format suitable for GIS use.

  • Scanning for secondary raster data capturing: Documents, films, paper maps and aerial photographs are scanned in order to protect them from decay and to form the base on which other layers will be placed.

  • Methods for secondary vector data capturing:

    • Digitizing and vectorization convert raster data into vector data and digitize vector objects while reducing measurement errors during digitization.

    • Photogrammetry provides data for various purposes. Measurements made on photogrammetric images yield data used in GIS.
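To make the idea of vectorization more concrete, here is a deliberately naive Python sketch that turns a small binary raster into horizontal line segments. The function name and the sample grid are hypothetical, and production vectorization tools use far more elaborate tracing and cleaning steps.

```python
def vectorize_rows(grid):
    """Convert runs of filled cells in each raster row into segments.

    `grid` is a list of rows containing 0 (empty) or 1 (filled).
    Returns a list of (row, first_col, last_col) tuples, one per run.
    """
    segments = []
    for r, row in enumerate(grid):
        start = None  # column where the current run began, if any
        for c, cell in enumerate(row):
            if cell and start is None:
                start = c                          # a new run begins
            elif not cell and start is not None:
                segments.append((r, start, c - 1)) # the run just ended
                start = None
        if start is not None:                      # run reaches the row end
            segments.append((r, start, len(row) - 1))
    return segments


# Hypothetical 3 x 4 scanned raster
raster = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]
segments = vectorize_rows(raster)
```

Each tuple describes one vector object (a horizontal segment) in place of several individual raster cells.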


COGO data capturing is an abbreviation of “coordinate geometry” which means that geometry and coordinates are the two main
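As an illustration of the coordinate geometry idea, the Python sketch below computes point coordinates from a starting point and a series of bearing and distance measurements, which is one typical COGO calculation. The function name and the sample values are hypothetical; bearings are assumed to be azimuths in degrees, measured clockwise from north.

```python
import math


def traverse(start, legs):
    """Compute successive point coordinates from bearing/distance legs.

    `start` is the (easting, northing) of the first point.
    `legs` is a list of (bearing_deg, distance) pairs, where the bearing
    is an azimuth in degrees clockwise from north (an assumption of this
    sketch) and the distance is in the same unit as the coordinates.
    """
    x, y = start
    points = [(x, y)]
    for bearing_deg, distance in legs:
        theta = math.radians(bearing_deg)
        x += distance * math.sin(theta)  # change in easting
        y += distance * math.cos(theta)  # change in northing
        points.append((x, y))
    return points


# Hypothetical survey: 100 m due east, then 50 m due north
points = traverse((1000.0, 2000.0), [(90.0, 100.0), (0.0, 50.0)])
```

The resulting coordinate list can be stored directly as a vector line or polygon boundary.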


Furthermore, it is usually necessary to capture attribute data as well, but that process relies on other methods.


Recognition engines are used to capture, digitize and recognize data regardless of its form. There are engines for different fonts and different types of documents. Documents are divided into three categories depending on their structure:

  1. Structured shape: Data capturing is simple, as the shape is the same for all documents.
  2. Semi-structured shape: Data capturing is complicated but not impracticable. The shape depends on various parameters.
  3. Unstructured shape: These documents have no specific shape.

Some of the recognition engines used for data capturing in GIS are listed below:

  1. OMR: Optical Mark Recognition. Technology that recognizes marks in checkboxes.
  2. ICR: Intelligent Character Recognition. It can read handwritten data in European and American writing styles.
  3. BCR: Barcode Character Recognition. As its name suggests, this engine is ideal for barcode recognition.
  4. OCR: Optical Character Recognition. Used to recognize and capture machine-printed characters, such as those produced by computer printers and typewriters.
  5. OCR-A & OCR-B: Data capturing for postal and banking documents printed in the OCR-A and OCR-B fonts respectively.
  6. MICR CMC7-E13B: Magnetic Ink Character Recognition. Reads the CMC7 and E13B code lines.
  7. CHR: Used for handwritten documents that are not written in capital letters. Its advantage is that it can recognize unconstrained handwriting that is freely written and unconnected.
  8. AMK: Lets the user enter data manually if, for any reason, automatic entry is not wanted.
  9. IDE: Allows data capturing of whole pages, for cases where signatures or parts of the page need to be kept unchanged.

UIZ Company undertakes projects and provides integrated information in the field of data capturing in GIS and remote sensing. You can reach us at +49-30-20679130 or visit the UIZ data capturing webpage.