AmpCamp 3 – HDFS


As described in a previous post, I used Cloudera .debs to install Hadoop/HDFS on an Ubuntu 12.04 ‘Precise’ single node. The next step is to put some data into HDFS and use it. (Note: this did not work in a 32-bit VM.)

The Hadoop/HDFS install consisted of two steps: first, obtain and install the cdh4-repository .deb, which registers the Cloudera repository and makes a suite of other packages available (perhaps auto-magically updated); then use those packages to install the features you want.

Running apt-get update now shows the Cloudera repository in the list:

Hit precise-cdh4 Release.gpg
Hit precise-cdh4 Release
Hit precise-cdh4/contrib Sources
Hit precise-cdh4/contrib amd64 Packages
Ign precise-cdh4/contrib TranslationIndex
Ign precise-cdh4/contrib Translation-en_US
Ign precise-cdh4/contrib Translation-en

Packages look like this:

Some quick reading shows at least two easy ways to interface with HDFS: one is the Hue suite, and another is an HDFS FUSE interface. More options are listed on this MountableHDFS page.

Cloudera supplies an HDFS FUSE mount with their distribution. Instructions on how to use the FUSE extension are here.
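For reference, a minimal sketch of what a FUSE mount might look like on a single node. It assumes the hadoop-hdfs-fuse package is installed (it provides the hadoop-fuse-dfs command) and that the NameNode listens on localhost port 8020. The host, the port, the /mnt/hdfs mount point, and both function names are assumptions for illustration, not values from my install.

```shell
# Build the dfs:// URI that hadoop-fuse-dfs expects.
# $1 = NameNode host (assumed default: localhost)
# $2 = NameNode port (assumed default: 8020)
hdfs_fuse_uri() {
    printf 'dfs://%s:%s\n' "${1:-localhost}" "${2:-8020}"
}

# Create the mount point and mount HDFS on it.
# $1 = mount point (hypothetical default: /mnt/hdfs), $2 = host, $3 = port
mount_hdfs_fuse() {
    mount_point="${1:-/mnt/hdfs}"
    sudo mkdir -p "$mount_point" &&
        sudo hadoop-fuse-dfs "$(hdfs_fuse_uri "$2" "$3")" "$mount_point"
}
```

Calling mount_hdfs_fuse with no arguments would try to mount dfs://localhost:8020 on /mnt/hdfs; after that, ordinary tools such as ls and cp should work against the mount point.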

When HDFS is running with this system (with or without FUSE), you can view a web interface on port 50070. A list of ports is here.

After mucking with the environment a bit, this came up:

Exception in thread "main" java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS):

I was getting this error when trying to start the secondarynamenode: it complained that the NameNode had an invalid address. So, investigating:

update-alternatives --get-selections | grep hadoop
hadoop-conf auto /etc/hadoop/conf.empty

less /etc/hadoop/conf.empty/core-site.xml

core-site.xml turned out to be missing the actual NameNode address (the fs.defaultFS property named in the error), which was not apparent in the doc I was reading. So, edit /etc/hadoop/conf.empty/core-site.xml and add a property element with name and value tags. The system starts up after this change.
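The edit described above can be sketched roughly as follows. The property name fs.defaultFS comes from the error message; the value hdfs://localhost:8020 is an assumption for a single-node setup, and core_site_xml is a hypothetical helper name. It only prints the XML, so nothing is overwritten until you redirect the output yourself.

```shell
# Print a minimal core-site.xml that sets fs.defaultFS.
# $1 = NameNode URI (assumed default: hdfs://localhost:8020)
core_site_xml() {
    namenode_uri="${1:-hdfs://localhost:8020}"
    cat <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>${namenode_uri}</value>
  </property>
</configuration>
EOF
}
```

To apply it, something like core_site_xml | sudo tee /etc/hadoop/conf.empty/core-site.xml would replace the whole file, so back up the existing one first if it holds other properties.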