BigData « darkblue blog

Archive of posts filed under the BigData category.

PostgreSQL 9.3 plus Hadoop File System

23 September 2013, 5:40 pm

It seems that things just got a little bit more interesting with the release of Pg 9.3.

Filed under BigData, Postgres.


Data Characterization and the Live

8 September 2013, 1:26 pm

People may already know about the OSGeo Live project. It's a great base for a VM, since a) it is stable and very well tested, and b) it has a lot of software installed, but in a way that is transparent through install scripts, so customization is as straightforward as it gets. I was faced with a […]

Filed under BigData, OSGeo Live, Postgres.


AmpCamp 3 – HDFS

3 September 2013, 9:10 am

As described in a previous post, I used Cloudera .debs to install Hadoop/HDFS on an Ubuntu 12.04 ‘Precise’ single node. The next step is to put some data into HDFS and use it. (Note: this did not work in a 32-bit VM.) The Hadoop/HDFS install consisted of two steps: obtain and install the .deb cdh4-repository, which enables […]

Filed under BigData.


AmpCamp 3 – The Stack

1 September 2013, 2:49 pm

The Berkeley Data Analytics Stack (BDAS) was the central subject at AmpCamp 3. Spark is the core of the stack, and it was recently adopted for incubation as an Apache project. True to form for a fast-moving OSS project, we actually used the 0.8.0 git repo version, rather than the 0.7.3 that you will find […]

Filed under BigData.


AmpCamp 3 – Intro

1 September 2013, 1:48 pm

I cannot say enough about AmpCamp 3, a two-day workshop at UC Berkeley that has just finished. The Berkeley AMP Lab (Algorithms, Machines and People) put on another great open source community-building event, with state-of-the-art tech, precision execution, and the sort of fun that comes from a job well done. Many interesting people, […]

Filed under BigData.


Sizing Up California – Homes

12 August 2013, 11:58 am

I live in California, and it’s a big place. I was reviewing some records regarding residential homes. Using some simple stats, I broke the records into partitioned tables in PostgreSQL by county, and then let the rest fall into a general bucket. There is no one correct answer for this kind of analysis setup, but […]

Filed under BigData, Postgres.

