Skip to content
Archive of posts filed under the Postgres category.

Sizing Up California – Homes

I live in California, and it’s a big place. I was reviewing some records regarding residential homes. Using some simple stats, I broke the records into partitioned tables in PostgreSQL by county, and then let the rest fall into a general bucket. There is no one correct answer for this kind of analysis setup, but […]

PostgreSQL banter on ‘record pseudotype’

dmg_83 Okay. I notice that if you a table like CREATE TABLE tbl (x int, y int) and do SELECT (tbl).x, (tbl).y FROM tbl WHERE tbl = ROW(1,2) you’ll indeed get back 1, 2 (assuming such a row exists). That seems a bit weird, since the left side of the expression “tbl = ROW(1,2)” does […]

PostgreSQL banter on ‘with’

RhodiumToad with foo(x) as (values (‘blah blah blah’)) insert into wherever select x from foo; merlinm or, with vars as (select ‘a’ as a, 2 as b) insert into foo values (select a from vars), select b from vars); merlinm oops, missing a parent there merlinm gah RhodiumToad with vars(a,b) as (values (‘a’, 2)) … […]

CTE to Remove Duplicates

Consider for a moment, a table of records including addresses and data, such that most every address occurs once, but for unknown reasons, also contains more than one duplicate address (656 in 55,279 below). How to remove duplicate addresses? let us count the ways.. Here is one: copy (  select a_name, address, city, state, zip, […]

Slice a Polygon by a Grid

Sometimes there are polygons with too many vertices to process comfortably. For that or other reasons, you may want to cut one or more polygons with a reference grid. Here is an example using the US 2010 Census and a set of grid cells in EPSG:3310: ——————————————————————– — take a single census place polygon and […]

PostGIS 2.0 Intro Presentation

at UC Berkeley Soda Hall, for the San Francisco PostgreSQL Users Group and again at the College of Natural Resources for GIS Day 2012 main slides and dustymugs’ raster slides errata: * The ST_MakeValid(geom) example could have been expanded to include ST_IsValidDetail() * It is arguably wrong to say that Lat/Lng (EPSG:4326) data is one […]

KNN Search Speed Test

SYNOPSIS: Using one million randomly generated points, time finding the ten nearest points to the center of the range by using the KNN distance operator versus using a call to PostGIS ST_Distance(). ## Use ST_Distance() explain analyze SELECT ST_Distance(the_geom, ST_GeomFromText(‘POINT(500 500)’)) FROM tgeom ORDER BY ST_Distance(the_geom, ST_GeomFromText(‘POINT(500 500)’)) asc limit 10; Total runtime: 1568.499 ms […]

Check for Validity

SELECT name, state_name, fips, st_summary(the_geom), st_isvalidreason(the_geom) FROM usa_counties WHERE not st_isvalid(the_geom); NOTICE: Ring Self-intersection at or near point -70.82466 42.26050 -[ RECORD 1 ]—-+————————————————- name | Plymouth state_name | Massachusetts fips | 25023 st_summary | MultiPolygon[B] with 1 elements | Polygon[] with 1 rings | ring 0 has 31 points st_isvalidreason | Ring Self-intersection[-70.8246609058028 42.2605069326982] […]

Distance to Transit in SACOG

In California, there are multi-county, regional government agencies that perform some functions like planning public transportation systems. The capital city of California is Sacramento, and the regional council is called the Sacramento Area Council of Governments (SACOG). Using a database of train and light rail stations in a projection where distance measurement is accurate, and […]

plr median

—————————————– install ———— create or replace function r_median(_float8) returns float as ‘ median(arg1) ‘ language ‘plr’; CREATE AGGREGATE median ( sfunc = plr_array_accum, basetype = float8, stype = _float8, finalfunc = r_median ); —————————————– test ————— select max(a.res),avg(a.res),median(a.res),min(a.res) from ( select (st_area(wkb_geometry) – shape_area) as res from landtypes_090911_marin_parcel_adjusted) a; thuban1 Time: 21255 ms r900 Time: […]