PostGIS Geocoder behavior

I became interested in the practical behavior of the PostGIS TIGER 2010 Geocoder when I started doing statistical profiling of the results of geocoding 100,000 standard addresses in my county and got results that looked like what you see in the image at right. That inquiry morphed into this short paper (PDF).

Why statistical profiling? I was experimenting with geocoding a subset of Alameda County parcel records as an example data set after sitting in on most of a graduate Data Mining class at UC Berkeley, Spring 2011. In data mining, any set of attributes that can be reduced to a ‘distance’ measure can be characterized in various ways. With geo-coordinates, distance is obvious, but lexical or numerical relationships can be summarized as ‘distance’ also. Examples of all three occur in geocoding and its data sets.
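To make the three kinds of 'distance' concrete, here is a minimal Python sketch. It is only an illustration, not the method used in the paper: the function names are hypothetical, the geographic distance is the standard haversine great-circle formula, the lexical distance is plain Levenshtein edit distance, and the numerical distance is just the absolute gap between two house numbers.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points (degrees)."""
    earth_radius_km = 6371.0088  # mean Earth radius; an assumed constant
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def levenshtein(a, b):
    """Lexical distance: minimum single-character edits turning string a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, 1):
        curr = [i]
        for j, ch_b in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                   # delete ch_a
                curr[j - 1] + 1,               # insert ch_b
                prev[j - 1] + (ch_a != ch_b),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


def house_number_gap(n1, n2):
    """Numerical distance: absolute difference between two house numbers."""
    return abs(n1 - n2)
```

For instance, `levenshtein("TELEGRAPH AVE", "TELEGRAPH AV")` measures how far a street name in an input address is from its form in the reference data, while `haversine_km` can compare a geocoded point against a known location for the same address.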