Friday, May 27, 2011

And on a completely different note.. Potatoes..

At MySQL I think I once instituted something which became sort of a tradition. If you are not a current or former MySQLer, the rest of this post may not be too interesting, and possibly it's not that interestying in the general sense either. But as I no longer work for MySQL, and many other MySQLers who remember this are also former MySQLers today, I use this medium, my blog, for this years review of the #1 swedish delicatessen, Swedish New Potatoes. And yes, I am aware that this has little to do with the MySQL product, although it does have something to do with the culture of the old MySQL (i.e. you could email company-wide about potatoes and still keep your job).

So, what about this years new potatoes? They are kinda early this year, it's not even June yet, and I had my first last night. Newly, and lightly, boiled, eaten with some butter and a pinch of salt. The verdict? In potato-technical terms, the term is Yummy! This exquisite little vegetable really made my day yeasterday!

So, yes, Potatotes are still good, even at Recorded Future. And by the way, I hope to keep my job here, despite the potatoes...

/Karlsson

Thursday, May 26, 2011

Annoyances, annoyances. Or Yet another HBF (Half Baked Feature)

About any product, be it computer hardware, software or any other product, has features that are annoying to some of us. But few products has so many features that are annoying to just about everyone as computer software. And among computer software, database software in particular seems to to have these features, which some people seems to like, and some just find annoying. And then there are features, or lack of them or implementation specific details that seems to annoy just about everyone. Things that work in a partuicular way because someone, somewhere, in some distant universe, had the notion that this was a good thing. Often features relating back to ancient times. And sometimes features that you just know work in a weirdo way because the person, if it was a person, figuring out the feature of the implementation of it, really must have been smoking something that is illegal in many parts of the world. Which is not to say that these features sometimes have a use or that the lack of a feature doesn't have a reason (like "It seemed like a good idea at the time").

And for some reason, MySQL seems to be hit with these things more than most software and database software products. And although I like working with MySQL, having done so for many years by now, there are certain aspects of it that just annoys the h*ll out of me. Things such as:
  • We have feature A, yes. And we we also support feature B. Great. But you cannot use feature A and B at the same time.
  • Oh, yeah, we support feature C, but really it is so slow that we recommend against using it.
  • Sure we support feature D, but the implementation really sucks.
  • Oh, storage engine E is sure supported, but the quality is so bad that we have disabled it completely.
  • On platform F, we don't really support feature G, as there is a limitation H on that platform. A limitation that was removed from that platform sometime during the Nixon administration, but all the same.
  • Feature I has been replaced by feature J, so using I will not be supported in the near future, but J still lacks the full functionality of I.
  • Oh, yes, you can sure do that! You can use either feature K or feature L, which have overlapping functionality and none is really feature complete in and of itself.
  • Our development team is hard at work at A) NOT getting rid of these annoyances and B) adding more of them.
So now I have another annoyance (I guess you could see that coming). Again, a new cool useful feature bogged down by someone, somewhere, in some universe wasn't thinking. InnoDB (a great Storage Engine BTW) has from version 5,5 of MySQL (and for longer than that in the plugin) been able to drop an index without rebuilding the whole table. Or rather, this is not a cool new feature unless you look at nothing else than MySQL, this feature was way WAAAY overdue. InnoDB also supports another neat feature, which is FOREIGN KEYS. You may say that FKs are no good for performance, and maybe that is so, but they are useful for, at a declarative level, ensure data consistency and they also add a level of "documentation". The implementation of FKs assumes that you have an index on the referencing column (why I do not know. There are cases when you don't want this.), but the rest is just in the declaration of the table and in the checks for FK consistency when issuing DML on the FK referencing and referenced tables or cource. So, if we now assume that in terms of data base data related to the table in question, the only thing that is added when we add an FK is an index, a fully normal index, then you should be able to drop the FK without rebuilding the table, as we can do that with the index? Right?

Nope. No cigar. Dropping an index is nearly instantaneous and doesn't require rebuilding the table, dropping the FK does. (I just tried this in 5.6, just to make a point. Same thing there). But have I not forgotten something, should I not be able to issue a:
SET FOREIGN_KEY_CHECKS=0;
And then drop the FK and then the index? Nope, that actually seems to make things worse and seems to confuse MySQL a lot. If you do this, you may drop the index, without the FK, and in a sense things now works, as although the FK is there, it isn't checked! Even if you set FOREIGN_KEY_CHECKS back to 1! This really makes no sense at all to me!

Playing with FOREIGN_KEY_CHECKS and droping FKs and indexes in MySQL 5.6 seems to confuse the h*ll out of MySQL also. I might one day get down to create a reproducible case here.

Anyway, why can't you drop an FK without rewriting the table? When the index can be dropped without rewriting the table? And don't tell me not to use FKs, if a feature is implemented it really should work, else remove it. And if I am not to use FKs with MySQL (in almost any other RDBMS they work just fine), then why were they implemented? No, FKs are useful, not always and sometimes they are too limiting for performance, but in many cases, they are all OK and really should work.

As Thomas Watson of IBM fame put it: THINK!

/Karlsson

Wednesday, May 25, 2011

MySQL Database cleaner 1.0 released

If you ever, and I think many of use DBAs have, been in the situation where you are stuck with data in the database that isn't used and isn't accessed, data which may consist of rows that are no longer used, data rows that aren't references, because you don't use FOREIGN KEYs or they weren't applicable in this case. Or data that was once used, but no longer is.

And in many cases, this data is tucked in among your other good rows of data :-( One way of cleaning up the database in a case like this is to run standard DELETE statements, but there are a few issues with this:
  • You may be accessing a lot of data, so this may take a while.
  • You will be locking large amount of data for this.
  • The join statement to get the data that is no longer used and / or no longer referenced is complex.
  • There is no really good way to split this DELETE in smaller chunks, except using LIMIT, but if what takes a long time to execute if the complex join to find offending rows, this is not always useful.
What is needed is a cleaner. A program that can run a simple or complex SELECT statement to find identifiers of the data you want to delete, and as this is a SELECT, we will have less issues with long running transactions and database locking. And the the IDs gotten from this SELECT is used to DELETE the rows we want to delete, based on the PRIMARY KEY to make it fast and to minimize the time we lock things.
In addition, we would prefer the DELETE to run in multiple threads and for the program to be very configurable. Well, here is the fix for you: mycleaner. MyCleaner is a program that does just this. It is not very complex, but useful.

Before I point you to the download location, I have one thing that is on the TODO list but which is not yet fixed: MyCleaner only works with integer numeric keys (INT, SMALLINT, BIGINT etc). If you are find with this and have a need to clean your data, give MyCleaner a try, it 's GPL Open Source and is ready for download from sourceforge here.

Ideas for improvement, code contributes etc are welcome. And yes, there is also documentation, not the most comprehensive of documentation, but it is there.

/Karlsson

Monday, May 9, 2011

Slavereadahead 1.3 available

Version 1.3 of Slave read-Ahead is available for download. If you don't know what this tool is about, it is a tool that runs in the background, reads the incoming replication log on a slave and converts INSERT, UPDATE and INSERT ... SELECT statements into SELECT statements and executes these before the statement in question is executed on the server, the idea being that this will "prewarm" the MySQL caches for this date, for example the rows that an UPDATE is affecting will already be in the cache when the UPDATE arrives on the slave. Because of the way replication data is read, this tool only works with MySQL5.5 and up.

Version 1.3 introduces the auto-reconnect feature. This will reconnect to the MySQL server if the connection fails. To be sure that we restart, all existing connectings are released before a reconnect attempt is made. This version also has some more log printouts, mainly operations per second.

In addition, Version 1.3 fixes a few minor bugs, and one semi-major one, which caued not all INSERT ... SELECT statements to be handled properly.

/Karlsson