OpenTSDB

Setup HBase

To use OpenTSDB, you need to have HBase up and running. This page will help you get started with a simple, single-node HBase setup, which is good enough to evaluate OpenTSDB or monitor small installations. If you need scalability and reliability, you will need to set up a full HBase cluster.

You can copy-paste all the following instructions directly into a terminal.

Set up a single-node HBase instance

If you already have an HBase cluster, skip this step. If you plan to use fewer than 5 to 10 nodes, stick to a single node. Deploying HBase on a single node is easy and can help you get started with OpenTSDB quickly. You can always scale to a real cluster and migrate your data later.

wget http://www.apache.org/dist/hbase/hbase-0.98.10.1/hbase-0.98.10.1-hadoop1-bin.tar.gz
tar xfz hbase-0.98.10.1-hadoop1-bin.tar.gz
cd hbase-0.98.10.1-hadoop1

At this point, you are ready to start HBase (without HDFS) on a single node. But before starting it, I recommend using the following configuration:
hbase_rootdir=${TMPDIR-'/tmp'}/tsdhbase
iface=lo`uname | sed -n s/Darwin/0/p`
cat >conf/hbase-site.xml <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///$hbase_rootdir/hbase-\${user.name}/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.dns.interface</name>
    <value>$iface</value>
  </property>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <value>$iface</value>
  </property>
  <property>
    <name>hbase.master.dns.interface</name>
    <value>$iface</value>
  </property>
</configuration>
EOF

Make sure to adjust the value of hbase_rootdir if you want HBase to store its data somewhere more durable than a temporary directory. The default is to use /tmp, which means you will lose all your data whenever your server reboots. The remaining settings are less important: they simply force HBase to stick to the loopback interface (lo0 on Mac OS X, or just lo on Linux), which simplifies things when you are just testing HBase on a single node.
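
The two shell idioms used in the configuration snippet can be hard to read at a glance. The sketch below pulls them out into two small functions so you can see what each one evaluates to. The function names loopback_iface and default_rootdir are made up for this illustration and are not part of HBase or OpenTSDB.

```shell
# Pick the loopback interface name from a `uname` string:
# sed prints "0" only when the input contains "Darwin" (Mac OS X),
# so the result is lo0 on Mac OS X and lo everywhere else.
loopback_iface() {
  printf 'lo%s\n' "$(printf '%s\n' "$1" | sed -n 's/Darwin/0/p')"
}

# Use $TMPDIR as the parent directory, or /tmp when TMPDIR is unset.
default_rootdir() {
  printf '%s\n' "${TMPDIR-/tmp}/tsdhbase"
}

# Show what the configuration snippet would pick on this machine.
loopback_iface "$(uname)"
default_rootdir
```

To keep data across reboots, set hbase_rootdir to a directory of your choice instead of relying on this default.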

Now start HBase:

./bin/start-hbase.sh
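
If you want a quick way to check that the master process came up, the sketch below looks for the master's pid file and tests whether that process is alive. It assumes the stock daemon scripts, which by default write hbase-<user>-master.pid under /tmp unless HBASE_PID_DIR or HBASE_IDENT_STRING are set; if your installation differs, adjust accordingly. The function name hbase_master_running is hypothetical.

```shell
# Return success if a pid file for the HBase master exists and the
# process it names is still alive.
# $1: pid directory (assumed default: $HBASE_PID_DIR, else /tmp)
# $2: identity string (assumed default: $HBASE_IDENT_STRING, else the current user)
hbase_master_running() {
  pid_dir=${1:-${HBASE_PID_DIR:-/tmp}}
  ident=${2:-${HBASE_IDENT_STRING:-$(id -un)}}
  pid_file="$pid_dir/hbase-$ident-master.pid"
  [ -f "$pid_file" ] || return 1
  # kill -0 sends no signal; it only checks that the process exists.
  kill -0 "$(cat "$pid_file")" 2>/dev/null
}

if hbase_master_running; then
  echo "HBase master appears to be running"
else
  echo "HBase master does not appear to be running"
fi
```

The logs under the logs directory of your HBase installation remain the best place to look if the master fails to start.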

Using LZO

There is no reason to not use LZO with HBase. Except in rare cases, the CPU cycles spent on doing LZO compression / decompression pay for themselves by saving you time wasted doing more I/O. This is certainly true for OpenTSDB where LZO can easily compress OpenTSDB's binary data by 3 to 4x. Installing LZO is simple and is done as follows.

Pre-requisites

In order to build hadoop-lzo, you need to have Ant installed as well as liblzo2 with development headers:
apt-get install ant liblzo2-dev              # Debian/Ubuntu
yum install ant ant-nodeps lzo-devel.x86_64  # RedHat/CentOS/Fedora
brew install lzo                             # Mac OS X
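
Before building, it can save time to confirm that the command-line tools are actually on your PATH. The sketch below reports which of the given commands are missing. It checks ant and git (git is needed to fetch the hadoop-lzo sources); it does not check for the liblzo2 development headers. The function name missing_tools is made up for this example.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools ant git)
if [ -n "$missing" ]; then
  echo "Install these before building hadoop-lzo:" $missing
else
  echo "ant and git are available"
fi
```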

Compile & Deploy

Thanks to our friends at Cloudera for maintaining the Hadoop-LZO package:
git clone git://github.com/cloudera/hadoop-lzo.git
cd hadoop-lzo
CLASSPATH=path/to/hadoop-core-1.0.4.jar CFLAGS=-m64 CXXFLAGS=-m64 ant compile-native tar
hbasedir=path/to/hbase
mkdir -p $hbasedir/lib/native
cp build/hadoop-lzo-0.4.14/hadoop-lzo-0.4.14.jar $hbasedir/lib
cp -a build/hadoop-lzo-0.4.14/lib/native/* $hbasedir/lib/native

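
A simple sanity check after copying the files is to confirm that the jar and at least one native library landed where HBase will look for them. This is only a sketch: check_lzo_deploy is a hypothetical helper, and it verifies that files exist, not that the native libraries load correctly.

```shell
# Check that a hadoop-lzo jar and some native libraries were copied
# into the HBase directory given as $1.
check_lzo_deploy() {
  hbasedir=$1
  if ! ls "$hbasedir"/lib/hadoop-lzo-*.jar >/dev/null 2>&1; then
    echo "missing hadoop-lzo jar in $hbasedir/lib"
    return 1
  fi
  if [ -z "$(ls -A "$hbasedir/lib/native" 2>/dev/null)" ]; then
    echo "no native libraries in $hbasedir/lib/native"
    return 1
  fi
  echo "ok"
}

# Usage, with the same hbasedir as in the commands above:
#   check_lzo_deploy "$hbasedir"
```
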
Restart HBase and make sure you create your tables with COMPRESSION => 'LZO'.
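
As a rough illustration of what that looks like, the sketch below prints an HBase shell create statement with LZO enabled on one column family. The table name tsdb and the family name t are placeholders chosen for this example, and lzo_create_stmt is a hypothetical helper; substitute the tables and families you actually need.

```shell
# Print an HBase shell `create` statement for table $1 with column
# family $2, with LZO compression enabled on that family.
lzo_create_stmt() {
  printf "create '%s', {NAME => '%s', COMPRESSION => 'LZO'}\n" "$1" "$2"
}

# Placeholder table and family names, for illustration only.
lzo_create_stmt tsdb t

# To run it for real, pipe the statement into the HBase shell:
#   lzo_create_stmt tsdb t | ./bin/hbase shell
```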

Common gotchas:

Migrating to a real HBase cluster

TBD. In short:

Putting HBase in production

TBD. In short: