Performance Testing / Benchmarking an HBase Cluster
So you have set up a new HBase cluster and want to 'take it for a spin'.  Here is how, without writing a lot of code of your own.

BEFORE WE START

I like to have the hbase command available in my PATH, so I put the following in my ~/.bashrc file:

export HBASE_HOME=/hadoop/hbase
export PATH=$PATH:$HBASE_HOME/bin
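
A quick sanity check that the PATH change took effect (it just prints version information):

# hbase version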

A) HBASE PERFORMANCEEVALUATION

class : org.apache.hadoop.hbase.PerformanceEvaluation
jar : hbase-*-tests.jar

This is a handy class that comes with the distribution.  It can do reads and writes against HBase, and it spawns a MapReduce job to run the reads/writes in parallel.  There is also an option to do the operations in threads instead of MapReduce.

Let's find out the usage:

# hbase org.apache.hadoop.hbase.PerformanceEvaluation

Usage: java org.apache.hadoop.hbase.PerformanceEvaluation \
  [--miniCluster] [--nomapred] [--rows=ROWS] <command> <nclients>
....
[snipped]
...

So let's run a randomWrite test:

# time hbase org.apache.hadoop.hbase.PerformanceEvaluation  randomWrite 5

we are running 5 clients.  By default, this runs in MapReduce mode
each client inserts 1 million rows (the default), about 1 GB of data (1000 bytes per row).  So the total data size is 5 GB (5 x 1)
typically there will be 10 maps per client, so we will see 50 (5 x 10) map tasks
you can watch the progress on the console and also on the JobTracker UI (http://job_tracker:50030).

Once this test is complete, it will print out summaries:

... <output clipped>
....
Hbase Performance Evaluation
     Row count=5242850
     Elapsed Time in milliseconds = 1789049
.....

real    3m21.829s
user    0m2.944s
sys     0m0.232s

I actually like to look at the elapsed REAL time (which I measure using the unix 'time' command), and then do this calculation:

row count = 5242850 (~5 million rows)
total time = 3 min 21 sec = 201 seconds

write throughput
= 5242850 rows / 201 seconds = 26083.8 rows/sec
= 5 GB data / 201 seconds = 5 * 1000 MB / 201 sec = 24.87 MB/sec
insert time = 201 seconds / 5242850 rows = 0.038 ms/row
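
If you don't feel like doing the division by hand, a couple of bc one-liners reproduce the same numbers (values taken from the run above):

# echo "scale=1; 5242850 / 201" | bc      # rows per second -> 26083.8
# echo "scale=2; 5 * 1000 / 201" | bc     # MB per second   -> 24.87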


This should give you a good idea of the cluster throughput.
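
If you just want a quick smoke test, or you want to try the thread-based mode mentioned earlier, the --nomapred and --rows options from the usage above can be combined.  For example, a thread-based run with 3 clients writing 100,000 rows each could look like this:

# time hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=100000 randomWrite 3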


Now, let's do a READ benchmark:

# time hbase org.apache.hadoop.hbase.PerformanceEvaluation  randomRead 5

and you can calculate the read throughput the same way.
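
Depending on your HBase version, PerformanceEvaluation supports other commands as well (for example sequentialWrite, sequentialRead and scan); run it without arguments to see the full list for your version.  For instance:

# time hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialRead 5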

B) YCSB


YCSB is a performance testing tool released by Yahoo.  It has an HBase mode that we will use.

First, read the excellent tutorial by Lars George on using YCSB with HBase.
And follow his instructions for setting up HBase and YCSB (I won't repeat them here).
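
For reference, the HBase side of that setup essentially boils down to creating the table YCSB writes to.  Assuming the default table name 'usertable' (a YCSB default; the tutorial covers the details) and the column family used below, it looks roughly like this in the hbase shell:

# hbase shell
hbase> create 'usertable', 'family'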


YCSB ships with a few 'workloads'.  I am going to run 'workloada', which is a 50/50 mix of reads and writes.

Step 1) Setting up (loading) the workload:
java -cp build/ycsb.jar:db/hbase/lib/* com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=family -p recordcount=10000000  -threads 10 -s > load.dat
-load : we are loading the data
-P workloads/workloada : we are using workloada
-p recordcount=10000000   : 10 million rows
-threads 10 : use 10 threads to parallelize inserts
-s  : print progress on stderr (the console) every 10 secs
> load.dat :   save the output in this file

Examine the file 'load.dat'.  Here are the first few lines:

YCSB Client 0.1
Command line: -load -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=family -p recordcount=10000000 -threads 10 -s
[OVERALL], RunTime(ms), 786364.0
[OVERALL], Throughput(ops/sec), 12716.757125199018
[INSERT], Operations, 10000000
[INSERT], AverageLatency(ms), 0.5551727
[INSERT], MinLatency(ms), 0
[INSERT], MaxLatency(ms), 34580
[INSERT], 95thPercentileLatency(ms), 0
[INSERT], 99thPercentileLatency(ms), 1
[INSERT], Return=0, 10000000
[INSERT], 0, 9897989
[INSERT], 1, 99298

The important numbers here are the overall throughput (how many ops were performed each second) and the total runtime in ms (~786 seconds).
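
You can sanity-check the reported throughput against the runtime yourself:

# echo "scale=2; 10000000 * 1000 / 786364" | bc    # inserts per second -> 12716.75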


Step 2) Running the workload
The previous step loaded the data.  Now let's run the workload.

java -cp build/ycsb.jar:db/hbase/lib/* com.yahoo.ycsb.Client -t -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=family -p operationcount=10000000 -s -threads 10 > a.dat

Differences are:
-t : for transaction mode  (read/write)
-p operationcount : specifies how many operations to run (10 million here)
Now let's examine a.dat:

YCSB Client 0.1
Command line: -t -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=family -p operationcount=10000000 -threads 10 -s
[OVERALL], RunTime(ms), 2060800.0
[OVERALL], Throughput(ops/sec), 4852.484472049689
[UPDATE], Operations, 5002015
[UPDATE], AverageLatency(ms), 0.6575520065413638
[UPDATE], MinLatency(ms), 0
[UPDATE], MaxLatency(ms), 28364
[UPDATE], 95thPercentileLatency(ms), 0
[UPDATE], 99thPercentileLatency(ms), 0
[UPDATE], Return=0, 5002015
[UPDATE], 0, 4986514
[UPDATE], 1, 15075
[UPDATE], 2, 0
[UPDATE], 3, 2
....
....[snip]
....
[READ], Operations, 4997985
[READ], AverageLatency(ms), 3.3133978993534394
[READ], MinLatency(ms), 0
[READ], MaxLatency(ms), 2868
[READ], 95thPercentileLatency(ms), 13
[READ], 99thPercentileLatency(ms), 24
[READ], Return=0, 4997985
[READ], 0, 333453
[READ], 1, 1866771
[READ], 2, 1197919

Here is how to read it:

The overall details are printed at the top, then the UPDATE stats are shown, followed by many lines of per-millisecond latency counts for UPDATE.  Scroll down further (or search for READ) to find the READ stats; the average read latency is about 3.31 ms.
The percentiles are interesting too.  We can satisfy 95% of read requests within 13 ms.  Pretty good.  Almost as fast as an RDBMS.
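
Rather than scrolling through the whole file, you can also pull out just the lines for one operation type with grep, for example:

# grep '^\[READ\]' a.dat | head -20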
