In general, I often feel that many Open Source projects start with good intentions for what the project should do and how, but then more stuff is added as you require it, and suddenly what started out as a simple and fast application for a narrow usecase, suddenly turns into a bit of a mess. And the issue might well be that building fast, compact software for a specialized usecase, as they start out, is not the same as writing generic software, with a wide range of use cases, code that can easily be maintained and enhanced as we go along. And why should it not be like that? In many cases, this is just fine and the limited usecase is just what the project sets out to do, and it does it well. But sometimes this turns into something really annoying, and at the same time useful.
I think MySQL is partly like this. There are many things in MySQL that work really well, so many things that in a small set of code achives so much useful stuff. And then there are things with MySQL that are just outright wrong and yet another bunch of things that are just plainly annoying. Largely, MySQL is very much developer focused more than DBA focused, although this is improving (and this is also a personal opinion of mine).
And then we have MongoDB, one of the big contenders on the NoSQL side in the NoSQL vs. SQL battle (which is a silly battle, but lets ignore that for now). Now, MongoDB is supposed to be a database. One that is faster, more compact and more targeted towards general database needs than MySQL. A database that can scale and replicate and shards automatically! Brilliant. And then this boring old DBA comes around with his bitterness and boredom and ignorance of the "new" database system. And he says things like: "Can you do a backup"... Yikes! Never thought of that. And what boooring guy that DBA is!
Yes, with MongoDB, backups is clearly an afterthought. And although this is again my personal opinion, I base it on something, namely this: Backing up a sharded Cluster. This is just plain silly. The way these steps are taken is in no way consistent, some of the operations are asynchronous, which means you have to wait for them (write code. To do a friggin' backup? Who came up with THAT daft idea?). You have to backup config servers cold, i.e. shut down. Dead. Who came up with that? And in the end, you don't even get a consistent backup. And yes, if you use replicas, you have to physically back up the replicas also! What?
Whoever figured out how to shard and replicate with MongoDB did a reasonable job, it actually works OK. But the person in question apparently forgot that databases are to be backed up. And before you ask: No, I am NOT going to take a mongodump of 1.5 Tb data!
This said, most aspects of MongoDB are OK, but backups are a mess. Read the page I linked to above, and you also realize that it is not well documented, to say the least, how to backup and what happens if the steps aren't followed? What happens if I do not backup the stopped config server? Can I do a mongodump of the config server instead? Why in heavens name can't I:
- Flush and lock the config servers?
- Flush all the dataservers in one go? No, you have to do it in one dataserver at the time.
- Flush and lock the config servers?
- And yes, why can't you flush and lock the config servers.
And having written all this, I have now created a script that will do all this for us, in unattended fashion. Yes, backups are supposed to be able to run unattended! No, I do not want any manual checking in the midst of a backup process! I do not want to be up at 3AM!
And Yes! I want a way to verify my backups, with some ease! No, I'm not going to set up a cluster of a 8 nodes with 4 Tb disk, just to verify a backup! And this is usually much less of a problem with MySQL, as it is not sharded / distributed. But for anything that IS sharded / distributed, for heavens sake, make sure there are tools to support this. In particular Backup tools!