Skip to content

Extract and Sum Classified Areas by Grid Cell

PostgreSQL_logo.3colors.120x120

Given a set of classified polygons, in this example classified into three categories by name, apply a reference grid, and sum the areas for each classification into float fields in a destination table. This code contains test harness pieces in it also, so that a smaller example can be executed and verified, before applying the process to large sets.

drop table if exists ca_lt1 cascade; --test_geom_out;
create table ca_lt1 --test_geom_out
as    select 
  g.gridcode, 
  g.the_geom,
  SUM(d.acres_urban) as acres_urban,
  SUM(d2.acres_constrained) as acres_constrained,
  SUM(d3.acres_greenfield) as acres_greenfield
FROM 
  grid150m_uniq g
  --test_grid g
LEFT JOIN 
( select 
     sum(st_area( st_intersection( land.wkb_geometry, g2.the_geom))) as acres_urban,
     g2.gridcode
  from 
     grid150m_uniq g2, --test_geom land
     ca_landtypes_df land
   where st_intersects( land.wkb_geometry, g2.the_geom) AND
     land.landtype = 'Urban' 
     AND st_intersects( st_geomfromEWKT('SRID=3310;POLYGON((-147472.090  71129.874, -113634.730  71129.874,   -113634.730  50253.940,   -147472.090 50253.940,  -147472.090 71129.874))'), g2.the_geom)
  group by g2.gridcode
   ) d
ON ( g.gridcode = d.gridcode )

LEFT JOIN 
( select 
     sum(st_area( st_intersection( land.wkb_geometry, g2.the_geom))) as acres_constrained,
     g2.gridcode
  from 
     grid150m_uniq g2, --test_geom land
     ca_landtypes_df land
   where st_intersects( land.wkb_geometry, g2.the_geom) AND
     land.landtype = 'Constrained' 
     AND st_intersects( st_geomfromEWKT('SRID=3310;POLYGON((-147472.090  71129.874, -113634.730  71129.874,   -113634.730  50253.940,   -147472.090 50253.940,  -147472.090 71129.874))'), g2.the_geom)
  group by g2.gridcode
   ) d2
ON ( g.gridcode = d2.gridcode )

LEFT JOIN 
( select 
     sum(st_area( st_intersection( land.wkb_geometry, g2.the_geom))) as acres_greenfield,
     g2.gridcode
  from 
     grid150m_uniq g2, --test_geom land
     ca_landtypes_df land
   where st_intersects( land.wkb_geometry, g2.the_geom) AND
     land.landtype = 'Greenfield' 
     AND st_intersects( st_geomfromEWKT('SRID=3310;POLYGON((-147472.090  71129.874, -113634.730  71129.874,   -113634.730  50253.940,   -147472.090 50253.940,  -147472.090 71129.874))'), g2.the_geom)
  group by g2.gridcode
   ) d3
ON ( g.gridcode = d3.gridcode )

  WHERE st_intersects( st_geomfromtext('SRID=3310;POLYGON((-147472.090  71129.874, -113634.730  71129.874,   -113634.730  50253.940,   -147472.090 50253.940,  -147472.090 71129.874))'), g.the_geom)

GROUP BY 
  g.gridcode,
  g.the_geom;


sandag_landtypes_sm

sandag_minus_parcels3

p_gridded


 

Early UrbanFootprint Grid Development

a few variations on grid visualization:

6-placetyped_constraints_perc

1-set_gridded_analysis

2-set_gridded_load_constraint

4-set_gridded_constraints

5-set_gridded_constraints_perc

percentages_c

Super Double-Box Search-Off

pg_logo

“A quick estimate of performance of PostGIS 2 with
geometry bounding boxes as doubles vs as floats”

All tests performed on Linux 2.6.38 x86_64
PostgreSQL 9.0.5
shared_buffers = 2400MB
temp_buffers = 64MB
work_mem = 512MB
maintenance_work_mem = 200MB

The Test – TIGER 2010 California

TIGER provides many possibilities for tests.. The one I chose here
uses PLACE (a large’ish MULTIPOLY) and EDGE (MULTILINE)

A random sampling of 100 EDGES in California shows the following characteristics:

select st_npoints(the_geom) from ca_edges order by random() limit 100;
——————————-
min,median,mean,max stdev
2, 5.15, 16.333, 212 34.23

sorted, the first dozen entries are 2, the last are the following
[ … 36,37,43,44,51,57,58,68,68,79,206,206,212]

I reasoned that although many EDGES are trivially short, there
exist substantial numbers of non-trivial EDGES, so they would be ok for a test.

Next, pick some large PLACE geometries..
From the Dozen largest (by area) PLACE rows, I picked 4 well-known cities

now, the test (with indexes in place I trust)

———-
select count(*) from ca_edges where st_intersects( the_geom,
(select st_envelope(the_geom) from ca_place where name = 'City_Name') );

———-

Results:

place_name,trunk_cnt,trunk_time,dblbox_cnt,dblbox_time
Bakersfield,39691,24.230,39691,25.275
Fresno,36527,24.217,36527,25.194
Oakland,44404,24.395,44404,25.680
Lancaster,16566,23.983,16566,25.194

preliminary conclusions

the search times were almost the same for each of the 4 cases
and I did not detect change with multiple runs of the same query
(though I could have tested that more, too)

search times were fast enough such that I believe indexes are being used
(I did not check all indexes carefully, but the database setup was the same
for both test contexts, so if they missed something, both tests likely missed
in the same way)

the result counts for both trunk and dblbox match, as a sanity check

performance hit is around 4% in each case

=======

caveat: I could have misjudged and chosen a poor test case for some
obscure-to-me reason.. hopefully, this is useful and informative