{"id":23,"date":"2012-03-31T15:23:05","date_gmt":"2012-03-31T22:23:05","guid":{"rendered":"http:\/\/blog.light42.com\/wordpress\/?p=23"},"modified":"2013-07-07T13:44:20","modified_gmt":"2013-07-07T20:44:20","slug":"gdal_retile-retooled","status":"publish","type":"post","link":"http:\/\/blog.light42.com\/wordpress\/?p=23","title":{"rendered":"gdal_retile retooled"},"content":{"rendered":"<p>Recently the task of combining and tiling some large GeoTIFFs came up, so I began an investigation into adding threading support to <strong>gdal_retile.py<\/strong>.<\/p>\n<p>Unfortunately, it is widely known that Python is not the best environment for multithreaded work, due to the <a title=\"GIL\" href=\"http:\/\/docs.python.org\/glossary.html#term-global-interpreter-lock\" target=\"_blank\">global interpreter lock (GIL)<\/a>. However, while building threading support for some routine PostGIS tasks recently, I quickly found that dispatching a series of computationally intensive PostGIS tasks via <a title=\"psycopg2\" href=\"http:\/\/initd.org\/psycopg\/docs\/\" target=\"_blank\">psycopg2<\/a> from Python <a title=\"python threading\" href=\"http:\/\/docs.python.org\/library\/threading.html\" target=\"_blank\">threading<\/a> can work quite well. In that case, the bulk of the work is done by a large, external engine (PostGIS) and Python just acts as the brain to the brawn. 
I started on this gdal_retile.py experiment with the idea that the CPU-intensive parts of a gdal_retile.py job, perhaps <strong>color-space conversion<\/strong> and <strong>JPEG compression<\/strong>, might tip the balance such that the Python threading itself was no longer a bottleneck.<\/p>\n<p>I wanted to find a straightforward threading pattern in Python that was suitable for breaking up gdal_retile.py&#8217;s inner loops.<\/p>\n<h4>A Threading Pattern in Python<\/h4>\n<p><img decoding=\"async\" src=\"wp-content\/uploads\/2012\/03\/simple_threads_scrn2.png\" alt=\"\" align=\"right\" \/> After some experimentation, this program worked well enough.<\/p>\n<ul>\n<li><strong>Thread_Tiny<\/strong> is a threading.Thread subclass whose <code>__init__<\/code> function takes two Queue objects as arguments.<\/li>\n<li><strong>thread_queue<\/strong> maintains the threads themselves.<\/li>\n<li><strong>results_queue<\/strong> is where each thread pushes its result before being disposed.<\/li>\n<li><strong>tD<\/strong> is a dict containing anything the main program wants to pass to a new Thread.<\/li>\n<li>The main program calls put() to add an args dict to thread_queue, and a live thread will pick it up. If all threads are busy working, put() waits internally until a slot becomes available before adding the queue element.<\/li>\n<li>The app waits for all threads to complete.<\/li>\n<\/ul>\n<p>This construct seemed sufficiently complete to be a template for breaking up a moderately complex utility like gdal_retile.py.<\/p>\n<h4>Reworking gdal_retile.py<\/h4>\n<p><img decoding=\"async\" src=\"wp-content\/uploads\/2012\/03\/gdal_retile_snip0.png\" alt=\"\" align=\"right\" \/> I found that there were really two places where an inner loop was doing repeated work: first where the input is split into tiles as <code>TileResult_0<\/code>, and second when building a plane of a pyramid as <code>TileResult_N<\/code>. 
(In <strong>gdal_retile.py<\/strong>, either operation can be omitted.)<\/p>\n<p>Solution: A utility function <strong>Count_Cores()<\/strong> returns the number of cores in the Linux operating environment. Define a thread_queue and a result_queue, each an instance of Queue.Queue, for each case. The difference is that a thread_queue is created such that there is <em>only one slot per desired active process<\/em>. This way the thread_queue runs Threads in an orderly way. The number of active Threads is limited to &#8220;cores minus two&#8221;, or to one, by limiting the number of slots available in a new thread_queue. Instead of possibly clogging execution with an arbitrarily large set of new Threads to execute at initialization time, only <em>N<\/em> threads can run at any given time.<\/p>\n<p>Next, a basic unit of work has to be defined that can be turned over to Threads. In <code>tileImage()<\/code> there is a call to <code>createTile()<\/code>, which takes as arguments: a local object called <strong>mosaic_info<\/strong>; offsets X,Y; height and width; a tilename; and an object representing an <strong>OGR Datasource<\/strong>. That datasource is open and has state, and therefore is not shareable. However, I discovered that inside <code>createTile()<\/code> the only thing the OGR datasource is used for is a single, write-only append of a feature describing the output tile&#8217;s BBOX and filename. What if a result_queue replaced the non-shareable datasource passed into <code>createTile()<\/code>, and each thread simply pushed its result into the result_queue? When all threads have completed, iterate and retrieve each thread result, adding each to the OGRDS data source as a feature. 
<em>Here is some code&#8230;<\/em><\/p>\n<p><img decoding=\"async\" src=\"wp-content\/uploads\/2012\/03\/gdal_retile_snip1.png\" alt=\"\" align=\"right\" \/><\/p>\n<h4>A Second Comparison<\/h4>\n<p>Fortunately, while discussing this on <code>#gdal<\/code> <a href=\"http:\/\/twitter.com\/amatix\" target=\"_blank\">Robert Coup<\/a> (<strong>rcoup<\/strong>) happened to be online. In his organization there was another attempt at making <strong>gdal_retile.py<\/strong> parallel too, but the results were not satisfying. rcoup tossed the code over, and I ran that as well.<\/p>\n<h4>Some Results?<\/h4>\n<p><code>time gdal_retile.py -s_srs EPSG:900913 -co \"TILED=YES\" -co \"COMPRESS=JPEG\" -ps 1024 1024 -r \"bilinear\" -co PHOTOMETRIC=YCBCR -targetDir out_dir_stock Fresno_city_900913.tif<br \/>\n<\/code><\/p>\n<p>In the end, the version rcoup sent, the threaded version described here, and the stock version, all ran against a 77,000 pixel x 84,000 pixel GeoTIFF (18G, three bands) in about the same 50 minutes each on a local machine (producing a matching 6309 tiles). I suspect that something in the <strong>GDAL<\/strong> python interfaces (aside from the GIL) is preventing real parallelism from kicking in, though there are a lot of corners for problems to hide in just this small example.<\/p>\n<p>Long ago, <strong>hobu<\/strong> had <a href=\"http:\/\/lists.osgeo.org\/pipermail\/gdal-dev\/2008-April\/016718.html\" target=\"_blank\">this to say<\/a> <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Recently the task of combining and tiling some large GeoTIFFs came up.. so began an investigation of adding threading support to gdal_retile.py Unfortunately, it is widely known that python is not the best environment for multithreaded work, due to the global interpreter lock (GIL). 
However, while building threading support for some routine postgis tasks recently, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[],"_links":{"self":[{"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/23"}],"collection":[{"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=23"}],"version-history":[{"count":93,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/23\/revisions"}],"predecessor-version":[{"id":1207,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/23\/revisions\/1207"}],"wp:attachment":[{"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=23"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=23"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/blog.light42.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=23"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}