Monday, August 4, 2014

What is HandlerSocket? And why would you use it? Part 1

HandlerSocket is included with MariaDB and acts like a simple NoSQL interface to InnoDB, XtraDB and Spider and I will describe it a bit more in this and a few upcoming blogs.

So, what is HandlerSocket? Adam Donnison wrote a great blog on how to get started with it, but if you are developing MariaDB applications using C, C++, PHP or Java what good does HandlerSocket do you?

HandlerSocket in itself is a MariaDB plugin, of a type that is not that common as is is a daemon plugin. Adam shows in his blog how to enable it and install it, so I will not cover that here. Instead I will describe what it does, and doesn't do.

A daemon plugin is a process that runs "inside" the MariaDB. A daemon plugin can implement anything really, as long as it is relevant to MariaDB. One daemon plugin is for examples the Job Queue Daemon and another then is HandlerSocket. A MariaDB Plugin has access to the MariaDB internals, so there is a lot of things that can be implemented in a daemon plugin, even if it is a reasonable simple concept.

Inside MariaDB there is an internal "handler" API which is used as an interface to the MariaDB Storage Engines, although not all engines supports this interface. MariaDB then has a means to "bypass" the SQL layer and access the Handler interface directly, and this is done by using the HANDLER commands with MariaDB. Note that we are not talking about a different API to uise the handler commands, instead they are executed using the same SQL interface as you are used it's, it's just that you use a different set of commands. One limitation of these commands is that they only allow reads, even though the internal Handler interface supports writes as well as reads.

When you use the HANDLER commands, you have to know not only the tables name that you access, but also the index you will be using when you access that table (assuming then you want to use an index, but you probably do). A simple example of using the HANDLER commands follows here:

# Open the orders table, using the alias o
HANDLER orders OPEN AS o;

# Read an order record using the primary key.
HANDLER o READ `PRIMARY` = (156);

# Close the order table.
HANDLER o CLOSE;

In the above example, I guess this is just overcomplication something basically simple, as all this does is the same as
SELECT * FROM orders WHERE id = 156;

The advantage of using the handler interface though is performance, for large datasets using the Handler interface is much faster.

All this brings up three questions:
  • First, if we are bypassing the SQL layer, by using the HANDLER Commands, would it not be faster to bypass the SQL level protocol altogether, and just use a much simple protocol?
  • Secondly, while we are at it, why don't we allow writes, as this is the biggest issues, we can always speed reads by scaling out anyway?
  • And last, how much faster is this, really?
The answer to the first two questions then is the Handler Socket plugin, which was the whole deal with this blog, as this use a separate, very simple, protocol and allows writes! For the third question, this is where I come in, I have done some simple INSERT style benchmarks using HandlerSocket, and I have some results for you in my next blog. So don't touch that dial, I'll be right back!

/Karlsson

No comments: