2010/04/06 00:20



Making the Kyoto Cabinet Ruby binding work in parallel


I explained how to use the Kyoto Cabinet (KC) Ruby binding in a previous article. This time I describe how I made the binding work in parallel, which I only touched on there.


Threading model in Ruby 1.9


The Ruby 1.9 manual says the following.


"Threads are implemented using native threads, but the current implementation has a Giant VM Lock (GVL), so only one native thread runs at any given time. However, the GVL is released when making IO-related system calls that may block. In that case threads can run simultaneously. Extension libraries can also operate the GVL, so it is possible to write an extension library that runs multiple threads at the same time."




This means that Ruby code and DBM operations can run in parallel if the binding releases the GVL before executing each DBM operation. Since KC's strong point is parallel operation, that's the way to go.
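The manual's point about blocking calls releasing the GVL can be observed from plain Ruby. The sketch below is only an illustration and not part of the binding: it uses sleep as a stand-in for a blocking operation, and timed_threads is a hypothetical helper name. Four threads that each block for 0.2 seconds should finish in roughly 0.2 seconds overall rather than 0.8, because the GVL is released while each one waits.

```ruby
# Run `count` threads that each block for `seconds`,
# and return the elapsed wall-clock time in seconds.
def timed_threads(count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(count) { Thread.new { sleep(seconds) } }
  threads.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = timed_threads(4, 0.2)
puts format("4 threads x 0.2s of blocking took %.2fs in total", elapsed)
```

A native DBM call wrapped so that it releases the GVL behaves like the sleep here from the VM's point of view: other Ruby threads keep running while it is in progress.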





There is actually no documentation on how to operate the GVL from C extensions.

Luckily, someone told me how to do it when I tweeted about it. It is described in comments in Ruby's thread.c source file.

It seems that reading the Ruby source code is a much easier way to find information than googling. First check the list of public functions in ruby/intern.h, then jump to the details of the function you are interested in, although you can guess most of the behaviour from the names and signatures.




At first glance the function name suggests "a region of code that does something while blocking the thread", but that's not correct. Since the comment says "permit concurrent/parallel execution", it probably means "run any code that would block the thread inside this region". The signature is a bit complicated, as follows.




VALUE rb_thread_blocking_region(
  rb_blocking_function_t *func, void *data1,
  rb_unblock_function_t *ubf, void *data2);


Pass a pointer to the function you want to execute without the GVL as "func", and a pointer to the data you want to hand to that function as "data1". I wasn't quite sure what "ubf" and "data2" are for, but passing RUBY_UBF_IO and NULL seems to work. The source comment says "In short, this API is difficult to use safely.", which scared me a bit, but it looks manageable for simple use cases.









To use this function, the code in the binding has to be changed so that it runs inside rb_thread_blocking_region. For portability, I abstracted this as shown below.


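class NativeFunction {
 public:
  // The native work to run; implemented by each subclass.
  virtual void operate() = 0;
  static void execute(NativeFunction* func) {
#if defined(_KC_YARV_)
    // Ruby 1.9: run operate() with the GVL released.
    rb_thread_blocking_region(execute_impl, func, RUBY_UBF_IO, NULL);
#else
    // Ruby 1.8: no GVL to release, so call operate() directly.
    func->operate();
#endif
  }
 private:
  static VALUE execute_impl(void* ptr) {
    NativeFunction* func = (NativeFunction*)ptr;
    func->operate();
    return Qnil;
  }
};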





This lets you run code inside rb_thread_blocking_region by passing an object of a class derived from NativeFunction, which has an "operate" method and works like a functor.

There is also a path that calls "operate" directly without rb_thread_blocking_region, so that the same code works with both Ruby 1.8 and 1.9.

The following shows how to use NativeFunction.


bool db_remove_record(kc::PolyDB* db, const char* kbuf, size_t ksiz) {
  // Local functor that carries the arguments in and the result out.
  class FuncImpl : public NativeFunction {
   public:
    FuncImpl(kc::PolyDB* db, const char* kbuf, size_t ksiz) :
      db_(db), kbuf_(kbuf), ksiz_(ksiz), rv_(false) {}
    bool rv() {
      return rv_;
    }
   private:
    // Runs without the GVL, so it must not touch the Ruby VM.
    void operate() {
      rv_ = db_->remove(kbuf_, ksiz_);
    }
    kc::PolyDB* db_;
    const char* kbuf_;
    size_t ksiz_;
    bool rv_;
  } func(db, kbuf, ksiz);
  NativeFunction::execute(&func);
  return func.rv();
}





Benchmark results

Using the approach above, I rewrote the existing code throughout and implemented the full feature set, then ran a performance test. The conditions were as follows.

* Machine = Core2 Duo
* Ruby version = 1.9.1
* Number of threads = 4
* Number of records = 1 million

















Ahh, the parallel version is slower than the serial version. I don't know the exact reason, but most likely 1 million records is not enough to stress KC, so what shows up instead is the additional overhead of the threads.




This is quite sad. When the data is small enough to fit entirely in cache, the DB is too fast to be the bottleneck, and what you see is the overhead at the Ruby level. You might expect things to get faster once the data is too big to fit in cache, but that's not the case either, because an HDD handles only one IO at a time and cannot work in parallel. This means that you only benefit from KC's parallelisation if you use an SSD.




To sum up, you cannot gain much from parallelising the DB layer when Ruby itself can only run serially, which is the case for most users, though it could help people with better hardware (e.g. an SSD). I need to investigate further once I have a proper SSD environment.
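For anyone who wants to repeat this kind of measurement, a harness of roughly the following shape would do. This is a minimal sketch under stated assumptions, not the benchmark used above: StandInStore is a hypothetical in-memory substitute (a Hash behind a Mutex) so that the script runs without KC installed, timed_inserts is a hypothetical helper name, and the record count is reduced to keep the run short. With the real binding you would pass an opened DB object instead of the stand-in.

```ruby
require 'benchmark'

# Hypothetical stand-in for a DB handle: a Hash guarded by a Mutex.
class StandInStore
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  # Store one record; returns true like a successful write.
  def set(key, value)
    @lock.synchronize { @data[key] = value }
    true
  end

  def count
    @lock.synchronize { @data.size }
  end
end

# Split `total` inserts evenly across `nthreads` threads
# and return the elapsed wall-clock time in seconds.
def timed_inserts(store, nthreads, total)
  per_thread = total / nthreads
  Benchmark.realtime do
    threads = Array.new(nthreads) do |t|
      Thread.new do
        base = t * per_thread
        per_thread.times { |i| store.set(format("%08d", base + i), "v") }
      end
    end
    threads.each(&:join)
  end
end

[1, 4].each do |n|
  store = StandInStore.new
  secs = timed_inserts(store, n, 100_000)
  puts format("%d thread(s): %.3fs for %d records", n, secs, store.count)
end
```

Because the stand-in holds the GVL the whole time, it only measures the Ruby-level overhead discussed above. Any gain from parallel mode would only appear with the real DB on storage that can serve requests in parallel.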






How to use parallel mode and some gotchas


The Ruby binding uses serial mode by default. In this mode the KC native API is called directly, not via rb_thread_blocking_region, which avoids some overhead when you do not use threads (the majority of cases).

To use parallel mode, pass "true" as the argument to DB::new.


db1 = DB::new        # Serial mode

db2 = DB::new(true)  # Parallel mode




Apart from "new", the binding is used in exactly the same way. However, DB#accept, DB#iterate, DB#each, and Cursor#accept cannot be used in parallel mode. What they have in common is that they call back into Ruby code from inside the KC native API. Ruby code cannot be called there because code running with the GVL released must never operate the Ruby VM. Having said that, even without "accept" you can do anything with "cas" alone.
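As an illustration of replacing a callback-style update with "cas", here is a minimal sketch. It assumes that DB#get returns nil for a missing key and that DB#cas(key, old, new) returns true only when the stored value still equals the old one. CasStore is a hypothetical in-memory stand-in with that shape so that the example runs without KC, and update_with_cas is a hypothetical helper name.

```ruby
# Hypothetical in-memory stand-in exposing only `get` and `cas`.
class CasStore
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def get(key)
    @lock.synchronize { @data[key] }
  end

  # Replace the value only if the current value equals `oval`
  # (nil means "no record"); a nil `nval` removes the record.
  def cas(key, oval, nval)
    @lock.synchronize do
      return false unless @data[key] == oval
      if nval.nil?
        @data.delete(key)
      else
        @data[key] = nval
      end
      true
    end
  end
end

# Read-modify-write without a callback inside the DB:
# retry until no other thread changed the record in between.
def update_with_cas(db, key)
  loop do
    old = db.get(key)
    updated = yield(old)
    return updated if db.cas(key, old, updated)
  end
end

db = CasStore.new
threads = Array.new(4) do
  Thread.new do
    1000.times { update_with_cas(db, "counter") { |v| (v.to_i + 1).to_s } }
  end
end
threads.each(&:join)
puts db.get("counter")
```

The block passed to update_with_cas runs as ordinary Ruby code between two native calls, so it never executes while the GVL is released.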





Parallel mode is available from the latest version of the Kyoto Cabinet Ruby binding. Ruby programs can normally only run serially (or rather, concurrently but not in parallel), but native IO-type work such as DB access can be parallelised. If you build a database large enough for KC to become the bottleneck, and you use a device with high parallel performance such as an SSD, parallel mode should help you achieve high throughput. I am kind of hinting that ROMA (http://github.com/roma/roma/) could be the perfect use case. I am just saying…


