From b2e302d0ba023f2f9f90ef35f73e2f7d5351f195 Mon Sep 17 00:00:00 2001
From: Chris Larsen
Date: Fri, 8 Apr 2016 11:50:15 -0700
Subject: [PATCH] [hbase10] Add missing `durability` documentation to readme.

Add documentation on how to use the hbase10 client with Google's Bigtable.
---
 hbase10/README.md | 99 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 96 insertions(+), 3 deletions(-)

diff --git a/hbase10/README.md b/hbase10/README.md
index 1da5bc43..dd01249e 100644
--- a/hbase10/README.md
+++ b/hbase10/README.md
@@ -1,5 +1,5 @@
 # HBase (1.0.x) Driver for YCSB
-This driver is a binding for the YCSB facilities to operate against a HBase 1.0.x Server cluster.
+This driver is a binding for the YCSB facilities to operate against an HBase 1.0.x Server cluster or Google's hosted Bigtable.
 To run against an HBase 0.94.x cluster, use the `hbase094` binding.
 To run against an HBase 0.98.x cluster, use the `hbase098` binding.
-See `hbase098/README.md` for configuration details.
+See `hbase098/README.md` for a quickstart to set up HBase for load testing and common configuration details.
+
+## Configuration Options
+In addition to those options available for the `hbase098` binding, the following options are available for the `hbase10` binding:
+
+* `durability`: Whether or not writes should be appended to the WAL. Bypassing the WAL can improve throughput but data cannot be recovered in the event of a crash. The default is true.
+
+## Bigtable
+
+Google's Bigtable service provides an implementation of the HBase API for migrating existing applications. Users can perform load tests against Bigtable using this binding.
+
+### 1. Set Up a Bigtable Cluster
+
+Log in to the Google Cloud Console and follow the [Creating Cluster](https://cloud.google.com/bigtable/docs/creating-cluster) steps. Make a note of your cluster name, zone and project ID.
+
+### 2. Launch the Bigtable Shell
+
+From the Cloud Console, launch a shell and follow the [Quickstart](https://cloud.google.com/bigtable/docs/quickstart) up to step 4 where you launch the HBase shell.
+
+### 3. Create a Table
+
+For best results, use the pre-splitting strategy recommended in [HBASE-4163](https://issues.apache.org/jira/browse/HBASE-4163):
+
+```
+hbase(main):001:0> n_splits = 200 # HBase recommends (10 * number of regionservers)
+hbase(main):002:0> create 'usertable', 'cf', {SPLITS => (1..n_splits).map {|i| "user#{1000+i*(9999-1000)/n_splits}"}}
+```
+
+Make a note of the column family; in this example it's `cf`.
+
+### 4. Fetch the Proper ALPN Boot Jar
+
+The Bigtable protocol uses HTTP/2, which requires an ALPN implementation. The implementation must be loaded at JVM startup, before the client attempts to connect to the cluster. If you're using Java 7 or 8, use this [Jetty Version Table](http://www.eclipse.org/jetty/documentation/current/alpn-chapter.html#alpn-versions) to determine the version appropriate for your JVM. (ALPN is included in JDK 9+.) Download the proper jar from [Maven](http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.mortbay.jetty.alpn%22%20AND%20a%3A%22alpn-boot%22) to a location on your system.
+
+### 5. Download the Bigtable Client Jar
+
+Download one of the `bigtable-hbase-1.#` jars from [Maven](http://search.maven.org/#search%7Cga%7C1%7Ccom.google.cloud.bigtable) to your host.
+
+### 6. Download JSON Credentials
+
+Follow these instructions for [Generating a JSON key](https://cloud.google.com/bigtable/docs/installing-hbase-shell#service-account) and save it to your host.
+
+### 7. Create or Edit hbase-site.xml
+
+If you have an existing HBase configuration directory with an `hbase-site.xml` file, edit the file as per below. If not, create a directory called `conf` under the `hbase10` directory. Create a file in the conf directory named `hbase-site.xml`.
+Provide the following settings in the XML file, making sure to replace the bracketed examples with the proper values from your Cloud console.
+
+```
+<configuration>
+  <property>
+    <name>hbase.client.connection.impl</name>
+    <value>com.google.cloud.bigtable.hbase1_0.BigtableConnection</value>
+  </property>
+  <property>
+    <name>google.bigtable.cluster.name</name>
+    <value>[YOUR-CLUSTER-ID]</value>
+  </property>
+  <property>
+    <name>google.bigtable.project.id</name>
+    <value>[YOUR-PROJECT-ID]</value>
+  </property>
+  <property>
+    <name>google.bigtable.zone.name</name>
+    <value>[YOUR-ZONE-NAME]</value>
+  </property>
+  <property>
+    <name>google.bigtable.auth.service.account.enable</name>
+    <value>true</value>
+  </property>
+  <property>
+    <name>google.bigtable.auth.json.keyfile</name>
+    <value>[PATH-TO-YOUR-KEY-FILE]</value>
+  </property>
+</configuration>
+```
+
+If you wish to try other API implementations (1.1.x or 1.2.x), change `hbase.client.connection.impl` to match the JAR you downloaded.
+
+If you have an existing HBase config directory, make sure to add it to the class path via `-cp [PATH-TO-BIGTABLE-JAR]:[PATH-TO-CONF-DIR]`.
+
+### 8. Execute a Workload
+
+Switch to the root of the YCSB repo, choose the workload you want to run, and `load` it first. With the CLI you must provide the column family, cluster properties and the ALPN jar to load.
+
+```
+bin/ycsb load hbase10 -p columnfamily=cf -cp [PATH-TO-BIGTABLE-JAR] -jvm-args='-Xbootclasspath/p:[PATH-TO-ALPN-JAR]' -P workloads/workloada
+
+```
+
+The `load` step only executes inserts into the datastore. After loading data, run the same workload to mix reads with writes.
+
+```
+bin/ycsb run hbase10 -p columnfamily=cf -jvm-args='-Xbootclasspath/p:[PATH-TO-ALPN-JAR]' -P workloads/workloada
+
+```
\ No newline at end of file
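
Reviewer note on step 3: the `SPLITS` expression is plain Ruby, so the keys it generates can be previewed outside the HBase shell before creating the table. The following is a minimal sketch, assuming a standalone Ruby interpreter is available; `split_keys` is a hypothetical helper name used only for illustration and is not part of YCSB or HBase.

```ruby
# Hypothetical helper that mirrors the SPLITS expression from step 3.
# Integer division keeps every key a whole number in the range 1000..9999.
def split_keys(n_splits)
  (1..n_splits).map { |i| "user#{1000 + i * (9999 - 1000) / n_splits}" }
end

keys = split_keys(200)
puts keys.first   # lowest split point
puts keys.last    # highest split point
puts keys.length  # number of split points passed to `create`
```

With `n_splits = 200` this yields 200 distinct keys running from `user1044` to `user9999`, spaced 44 or 45 apart because of the integer division.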