Skip to content
 

KNN Search Speed Test

pg_logo
SYNOPSIS: Using one million randomly generated points, time finding the ten nearest points to the center of the range by using the KNN distance operator versus using a call to PostGIS ST_Distance().


## Use ST_Distance()

explain analyze 
  SELECT ST_Distance(the_geom, ST_GeomFromText('POINT(500 500)')) 
  FROM tgeom ORDER BY ST_Distance(the_geom, ST_GeomFromText('POINT(500 500)')) 
    asc limit 10; 

Total runtime: 1568.499 ms

 QUERY PLAN                                                            
--------------------------------------------------------
 Limit  (cost=289943.64..289943.67 rows=10 width=128) (actual time=1568.452..1568.457 rows=10 loops=1)
   ->  Sort  (cost=289943.64..292443.64 rows=1000000 width=128) (actual time=1568.444..1568.447 rows=10 loops=1)
         Sort Key: (st_distance(the_geom, '01010000000000000000407F400000000000407F40'::geometry))
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Seq Scan on tgeom  (cost=0.00..268334.00 rows=1000000 width=128) (actual time=0.102..1268.725 rows=1000000 loops=1)

## Use KNN Sorting

explain analyze 
  SELECT the_geom < -> ST_GeomFromText('POINT(500 500)') 
  FROM tgeom ORDER BY the_geom < -> ST_GeomFromText('POINT(500 500)') asc limit 10;

Total runtime: 1.847 ms

 QUERY PLAN                                                            
--------------------------------------------------------

 Limit  (cost=0.00..0.74 rows=10 width=128) (actual time=1.476..1.754 rows=10 loops=1)
   ->  Index Scan using tgeom_idx on tgeom  (cost=0.00..74124.46 rows=1000000 width=128) (actual time=1.474..1.748 rows=10 loops=1)
         Order By: (the_geom < -> '01010000000000000000407F400000000000407F40'::geometry)


roughly 850 times faster !!


Complete Setup Instructions for this Example:
Base System: debian 6.0 ‘squeeze’
Add Repositories: backports testing


sudo apt-get -t squeeze-backports install build-essential postgresql-contrib-9.1 postgresql-client-9.1 postgresql-server-dev-9.1 postgresql-9.1 autoconf automake1.9 libtool flex bison gdb

## setup postgres however you are used to working with it..

svn co http://svn.osgeo.org/postgis/branches/2.0/ postgis_20_branch

svn co http://svn.osgeo.org/geos/branches/3.3/ geos_33_branch

svn co http://svn.osgeo.org/metacrs/proj/trunk/proj proj_trunk

sudo apt-get install libexpat1 libexpat1-dev libxslt1.1 libxml2-dev libjson0-dev libcurl4-openssl-dev python2.6-dev

svn checkout http://libkml.googlecode.com/svn/trunk/ libkml_trunk

svn co http://svn.osgeo.org/gdal/branches/1.9/ gdal_19_branch
./configure –with-python –with-curl –without-libtool

## create database testdb ##
psql testdb -f postgis/postgis.sql
psql testdb -f spatial_ref_sys.sql
psql testdb -f raster/rt_pg/rtpostgis.sql

>psql
psql (9.1.6)
Type “help” for help.

testdb=# select postgis_full_version();
POSTGIS=”2.0.2SVN r10643″ GEOS=”3.3.6dev-CAPI-1.7.6″ PROJ=”Rel. 4.8.0, 6 March 2012″ GDAL=”GDAL 1.9.2, released 2012/10/08″ LIBXML=”2.7.8″ LIBJSON=”UNKNOWN” RASTER

CREATE TABLE tgeom (gid serial PRIMARY KEY,the_geom geometry);
INSERT into tgeom(the_geom) select st_geomfromtext('POINT('|| ((random()*1000)::integer)::text || ' ' ||  ((random()*1000)::integer)::text || ')' ) from generate_series(1,1000000);
CREATE INDEX tgeom_idx on tgeom using GIST(the_geom);

## Done Setup

based on
http://www.depesz.com/2010/12/11/waiting-for-9-1-knngist/

UPDATE for debian/ubuntu
   there is now a repository at apt.postgresql.org