image slider

Geo-Coding - Capturing the
Spatial Location from Business Tabular Data

A number of options exist for the linking of a specific record in business (tabular or database) data to a physical spatial location (GPS coordinate). The process is referred to as Geocoding. The appropriate method of achieving this depends on many factors such as the type of available data, as well as possible confidentiality issues which might exclude the use of open public-domain data.

For one specific project the Dutch Reformed Church of Moreleta Park had a need to strategically look at the spatial location of their congregation members, in order to enable them to do better planning and decision making. Having the necessary GIS and database skills, SiQ offered their assistance in this regard.

The confidentiality of the data was of high importance and thus the process was completed inhouse without using web-based systems. In order to capture the spatial location of each congregation member it was necessary to source some base datasets from the local municipality. These datasets included the erf boundaries, as well as the street address dataset. The latest member database was also sourced from the church.

The records in the member database had to be linked to a corresponding address point in the street address dataset. Because the datasets didn’t correspond in many ways, it was necessary to embark on a thorough cleaning and quality control exercise, involving both datasets. Both datasets needed to be cleaned for the following reasons:

  • The addresses in the member database were captured in Afrikaans, whilst the addresses in the street address dataset of the municipality, were only available in English. Therefore many items in the street addresses in the municipal dataset had to be translated into Afrikaans, for example the word "street" had to be translated to "straat".
  • There was a lot of inconsistency in the way that the addresses were captured in the member database. It was therefore necessary to convert all the addresses in the database to the same format.

Because of the high number of records to be cleaned, and to ensure repeatability of the cleaning process, it was decided that the datasets would be imported into a SQL Server database and the cleaning processes handled with SQL scripts. Within this environment, even individual records that needed to be changed were handled within a script. The benefit of this approach is that cleaning updated member databases becomes a highly repeatable process, which is hugely beneficial with regards to the sustainability of the project.

After both datasets where cleaned, the records in the member dataset were matched (where possible) to the appropriate street address in the municipal dataset. The final step in the process was to create a spatial dataset using the X, Y coordinates of the linked data. In order to enable the relevant church personnel to view the spatial data, it was provided within an ArcReader document, which enabled them to view the data and do basic queries, as well as print maps.

"The benefit of this approach is that cleaning updated member databases becomes a highly repeatable process, which is hugely beneficial with regards to the sustainability of the project."