{"id":897,"date":"2012-11-07T14:58:51","date_gmt":"2012-11-07T22:58:51","guid":{"rendered":"http:\/\/blog.light42.com\/wordpress\/?p=897"},"modified":"2013-09-18T17:25:44","modified_gmt":"2013-09-19T00:25:44","slug":"knn-search-speed-test","status":"publish","type":"post","link":"http:\/\/blog.light42.com\/wordpress\/?p=897","title":{"rendered":"KNN Search Speed Test"},"content":{"rendered":"<p><img decoding=\"async\" src=\"http:\/\/blog.light42.com\/wordpress\/wp-content\/uploads\/2012\/06\/PostgreSQL_logo.3colors.120x120.png\" alt=\"pg_logo\" align=\"right\" height=108px width=108px \/><br \/>\n<strong>SYNOPSIS:<\/strong> Using one million randomly generated points, time finding the ten nearest points to the center of the range by using the <strong>KNN distance operator<\/strong> versus using a call to PostGIS <strong>ST_Distance()<\/strong>.<\/p>\n<hr \/>\n<h4>## Use ST_Distance()<\/h4>\n<pre><code>explain analyze \r\n  SELECT ST_Distance(the_geom, ST_GeomFromText('POINT(500 500)')) \r\n  FROM tgeom ORDER BY ST_Distance(the_geom, ST_GeomFromText('POINT(500 500)')) \r\n    asc limit 10; \r\n<\/code><\/pre>\n<p><strong>Total runtime: 1568.499 ms<\/strong><\/p>\n<pre><code> QUERY PLAN                                                            \r\n--------------------------------------------------------\r\n Limit  (cost=289943.64..289943.67 rows=10 width=128) (actual time=1568.452..1568.457 rows=10 loops=1)\r\n   ->  Sort  (cost=289943.64..292443.64 rows=1000000 width=128) (actual time=1568.444..1568.447 rows=10 loops=1)\r\n         Sort Key: (st_distance(the_geom, '01010000000000000000407F400000000000407F40'::geometry))\r\n         Sort Method: top-N heapsort  Memory: 25kB\r\n         ->  Seq Scan on tgeom  (cost=0.00..268334.00 rows=1000000 width=128) (actual time=0.102..1268.725 rows=1000000 loops=1)\r\n<\/code><\/pre>\n<h4>## Use KNN Sorting<\/h4>\n<pre><code>explain analyze \r\n  SELECT the_geom < -> ST_GeomFromText('POINT(500 500)') \r\n  FROM tgeom ORDER BY the_geom < -> ST_GeomFromText('POINT(500 500)') asc limit 10;\r\n<\/code><\/pre>\n<p><strong> Total runtime: 1.847 ms<\/strong><br \/>\n<code><\/p>\n<pre> QUERY PLAN                                                            \r\n--------------------------------------------------------\r\n\r\n Limit  (cost=0.00..0.74 rows=10 width=128) (actual time=1.476..1.754 rows=10 loops=1)\r\n   ->  Index Scan using tgeom_idx on tgeom  (cost=0.00..74124.46 rows=1000000 width=128) (actual time=1.474..1.748 rows=10 loops=1)\r\n         Order By: (the_geom < -> '01010000000000000000407F400000000000407F40'::geometry)\r\n<\/pre>\n<p><\/code><\/p>\n<hr \/>\n<p><em><\/p>\n<h4>roughly 850 times faster !!<\/h4>\n<p><\/em><\/p>\n<hr \/>\n<p>Complete Setup Instructions for this Example:<br \/>\nBase System: <strong>debian 6.0 &#8216;squeeze&#8217;<\/strong><br \/>\nAdd Repositories: backports testing<\/p>\n<hr \/>\n<p> sudo apt-get -t squeeze-backports install build-essential postgresql-contrib-9.1 postgresql-client-9.1 postgresql-server-dev-9.1 postgresql-9.1 autoconf automake1.9 libtool flex bison gdb <\/p>\n<p>## setup postgres however you are used to working with it.. <\/p>\n<p> svn co http:\/\/svn.osgeo.org\/postgis\/branches\/2.0\/ postgis_20_branch<\/p>\n<p> svn co http:\/\/svn.osgeo.org\/geos\/branches\/3.3\/ geos_33_branch<\/p>\n<p> svn co http:\/\/svn.osgeo.org\/metacrs\/proj\/trunk\/proj proj_trunk<\/p>\n<p> sudo apt-get install libexpat1  libexpat1-dev libxslt1.1 libxml2-dev libjson0-dev libcurl4-openssl-dev python2.6-dev<\/p>\n<p> svn checkout http:\/\/libkml.googlecode.com\/svn\/trunk\/ libkml_trunk<\/p>\n<p> svn co http:\/\/svn.osgeo.org\/gdal\/branches\/1.9\/ gdal_19_branch<br \/>\n   .\/configure &#8211;with-python &#8211;with-curl &#8211;without-libtool<\/p>\n<p> <strong> ##  create database testdb  ## <\/strong><br \/>\npsql testdb -f postgis\/postgis.sql<br \/>\npsql testdb -f spatial_ref_sys.sql<br \/>\npsql testdb -f raster\/rt_pg\/rtpostgis.sql<\/p>\n<p>>psql<br \/>\npsql (9.1.6)<br \/>\nType &#8220;help&#8221; for help.<\/p>\n<p>testdb=# select postgis_full_version();<br \/>\n<strong>POSTGIS=&#8221;2.0.2SVN r10643&#8243; GEOS=&#8221;3.3.6dev-CAPI-1.7.6&#8243; PROJ=&#8221;Rel. 4.8.0, 6 March 2012&#8243; GDAL=&#8221;GDAL 1.9.2, released 2012\/10\/08&#8243; LIBXML=&#8221;2.7.8&#8243; LIBJSON=&#8221;UNKNOWN&#8221; RASTER<\/strong><\/p>\n<pre><code>CREATE TABLE tgeom (gid serial PRIMARY KEY,the_geom geometry);\r\nINSERT into tgeom(the_geom) select st_geomfromtext('POINT('|| ((random()*1000)::integer)::text || ' ' ||  ((random()*1000)::integer)::text || ')' ) from generate_series(1,1000000);\r\nCREATE INDEX tgeom_idx on tgeom using GIST(the_geom);\r\n<\/code><\/pre>\n<p>## Done Setup<\/p>\n<p>based on<br \/>\n <a href=\"http:\/\/www.depesz.com\/2010\/12\/11\/waiting-for-9-1-knngist\/\" title=\"www.depesz.com\/2010\/12\/11\/waiting-for-9-1-knngist\/\" target=\"_blank\">http:\/\/www.depesz.com\/2010\/12\/11\/waiting-for-9-1-knngist\/<\/a><\/p>\n<p>UPDATE for debian\/ubuntu<br \/>\n&nbsp;&nbsp; there is now a repository at apt.postgresql.org<\/p>\n","protected":false},"excerpt":{"rendered":"<p>SYNOPSIS: Using one million randomly generated points, time finding the ten nearest points to the center of the range by using the KNN distance operator versus using a call to PostGIS ST_Distance(). ## Use ST_Distance() explain analyze SELECT ST_Distance(the_geom, ST_GeomFromText(&#8216;POINT(500 500)&#8217;)) FROM tgeom ORDER BY ST_Distance(the_geom, ST_GeomFromText(&#8216;POINT(500 500)&#8217;)) asc limit 10; Total runtime: 1568.499 ms [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"_links":{"self":[{"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/897"}],"collection":[{"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=897"}],"version-history":[{"count":36,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/897\/revisions"}],"predecessor-version":[{"id":1381,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/897\/revisions\/1381"}],"wp:attachment":[{"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=897"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=897"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=897"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}