Cookbook-type snippets to export data out of HBase or import dumps into HBase.
Export a table from HBase into the local filesystem:
bin/hbase org.apache.hadoop.hbase.mapreduce.Driver export \
table_name /local/path
Export a table from HBase into HDFS:
bin/hbase org.apache.hadoop.hbase.mapreduce.Driver export \
table_name hdfs://namenode/path
Import a table from a local dump into an existing HBase table:
bin/hbase org.apache.hadoop.hbase.mapreduce.Driver import \
table_name /local/path
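For scripting a round trip, the two Driver invocations above can be wrapped in small helpers. A minimal sketch — the helper names are hypothetical, and for safety they only echo the command line (drop the echo to actually run it):

```shell
#!/bin/sh
# Hypothetical helpers that build the Driver command lines shown above.
# Dry run: they print the command instead of executing it.
hbase_export_cmd() {
  # $1 = table name, $2 = destination path (local path or hdfs:// URI)
  echo "bin/hbase org.apache.hadoop.hbase.mapreduce.Driver export $1 $2"
}
hbase_import_cmd() {
  # $1 = table name, $2 = source path of the dump
  echo "bin/hbase org.apache.hadoop.hbase.mapreduce.Driver import $1 $2"
}

hbase_export_cmd table_name /local/path
hbase_import_cmd table_name /local/path
```

The same helpers work for the HDFS case, since only the path argument changes.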
It’s a good idea to count and compare the number of rows before exporting and after importing:
bin/hbase org.apache.hadoop.hbase.mapreduce.Driver \
rowcounter table_name
The number of rows is visible in a Hadoop counter called ROWS, as in the output below:
mapred.JobClient: org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
mapred.JobClient: ROWS=103821
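To compare the counts automatically, the ROWS value can be scraped from the job output with sed. A minimal sketch, assuming the log line format shown above (here a captured line stands in for the real job output):

```shell
#!/bin/sh
# Extract the ROWS counter value from rowcounter job output.
# In practice, pipe the job's stderr/stdout into the sed expression.
job_output="mapred.JobClient: ROWS=103821"
rows=$(echo "$job_output" | sed -n 's/.*ROWS=\([0-9][0-9]*\).*/\1/p')
echo "$rows"
```

Running this once before the export and once after the import, then comparing the two numbers, makes the check scriptable.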
Alternatively, you can use the Hadoop Tool interface, but it may complain about missing classes if hadoop-env.sh is not configured properly. For example, when launched without arguments, it displays the available options:
hadoop jar hbase-0.20.3.jar
An example program must be given as the first argument.
Valid program names are:
export: Write table data to HDFS.
hsf2sf: Bulk convert 0.19 HStoreFiles to 0.20 StoreFiles
import: Import data written by Export.
rowcounter: Count rows in HBase table
An HBase dump is one or more Hadoop SequenceFiles; you can inspect their contents with something like:
hadoop fs -fs local -text table_name/part-m-00000
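A Hadoop SequenceFile begins with the three magic bytes "SEQ" followed by a one-byte version number, so a quick sanity check on a dump part is possible without Hadoop at all. A small sketch (hypothetical helper name; assumes the part file has been copied to the local filesystem):

```shell
#!/bin/sh
# Sanity check: report whether a file starts with the SequenceFile
# magic bytes "SEQ".
check_seqfile() {
  if [ "$(head -c 3 "$1")" = "SEQ" ]; then
    echo "looks like a SequenceFile"
  else
    echo "not a SequenceFile"
  fi
}
# Usage, after copying a dump part locally:
# check_seqfile part-m-00000
```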