Geo-Coding - Capturing the Spatial
Location from Business Tabular Data
A number of options exist for the linking of a specific record in business (tabular or database)
data to a physical spatial location (GPS coordinate). The process is referred to as Geocoding. The
appropriate method of achieving this depends on many factors such as the type of available data,
as well as possible confidentiality issues which might exclude the use of open public-domain
data.
For one specific project the Dutch Reformed Church of Moreleta Park had a need to strategically
look at the spatial location of their congregation members, in order to enable them to do better
planning and decision making. Having the necessary GIS and database skills, SiQ offered their assistance
in this regard.
The confidentiality of the data was of high importance and thus the process was completed inhouse
without using web-based systems. In order to capture the spatial location of each congregation
member it was necessary to source some base datasets from the local municipality. These
datasets included the erf boundaries, as well as the street address dataset. The latest member
database was also sourced from the church.
The records in the member database had to be linked to a corresponding address point in the
street address dataset. Because the datasets didn’t correspond in many ways, it was necessary to
embark on a thorough cleaning and quality control exercise, involving both datasets. Both datasets
needed to be cleaned for the following reasons:
- The addresses in the member database were captured in Afrikaans, whilst the addresses in
the street address dataset of the municipality, were only available in English. Therefore many
items in the street addresses in the municipal dataset had to be translated into Afrikaans, for
example the word "street" had to be translated to "straat".
- There was a lot of inconsistency in the way that the addresses were captured in the member
database. It was therefore necessary to convert all the addresses in the database to the same
format.
Because of the high number of records to be cleaned, and to ensure repeatability of the cleaning
process, it was decided that the datasets would be imported into a SQL Server database and the
cleaning processes handled with SQL scripts. Within this environment, even individual records
that needed to be changed were handled within a script. The benefit of this approach is that
cleaning updated member databases becomes a highly repeatable process, which is hugely beneficial
with regards to the sustainability of the project.
After both datasets where cleaned, the records in the member dataset were matched (where
possible) to the appropriate street address in the municipal dataset. The final step in the process
was to create a spatial dataset using the X, Y coordinates of the linked data.
In order to enable the relevant church personnel to view the spatial data, it was provided within
an ArcReader document, which enabled them to view the data and do basic queries, as well as
print maps.
|
|
"The benefit of this approach is
that cleaning updated member
databases becomes a highly repeatable
process, which is hugely
beneficial with regards to the sustainability
of the project."
|
|