The Uncertainty Principle

Back in 2010, George James and I co-authored a paper that positioned the Mumps database as a Universal NoSQL database.   That paper seems to have become something of a classic, and actually provided much of the inspiration for this blog and the work I do behind the scenes.

Interestingly, the tag-line “Universal NoSQL” has been jumped on by a relatively new entrant to the burgeoning NoSQL database marketplace: ArangoDB.  I suppose imitation is the sincerest form of flattery (George and I adopted that tag-line first, guys!).  The ironic part is that if you go looking for their database on Google using that strap-line, our paper comes up at the top of the list, and people who would probably never otherwise have heard of Mumps databases get the chance to learn about them from our paper.  So perhaps we should say: “thanks Arango!”

When considering databases, people love to be able to categorise them.  It used to be simple: your database was either relational or not, and if not, nobody wanted to use it because relational was king.  Of course, before relational acquired that dominance, we had many other models, including hierarchical and network.  When NoSQL pushed the relational dogma aside, we suddenly found ourselves awash with database categories, including:

  • Relational (don’t worry, it’s still there!)
  • Key/Value
  • Columnar
  • Graph
  • Document
  • Object
  • Native XML

Most databases, particularly the NoSQL ones, are “one-trick ponies”: they adopt one of these models and are optimised for that model alone.  Until George and I wrote our paper, I used to find it difficult to answer the question: “so what type of database is Mumps?”.  The answer turned out to be “any or all of the above”!  Hence the Universal NoSQL database: indeed that should be Universal NOSQL database (where NOSQL means Not Only SQL).  InterSystems’ Caché demonstrates this, as they position it as an Object/Relational database, but of course under the covers is a good old Mumps database engine, so they’re perhaps doing themselves a disservice by narrowing their categorisation in this way.

There’s a reason why this multi-faceted nature of Mumps databases is important: in the real world, you need to be able to work with your data in different ways.  Organisations that have huge relational databases for their main corporate information management often end up copying snapshots of that data into a “data warehouse”: a database designed for slicing and dicing that data in ways that would be impractically slow on their relational database.  I’m starting to see a new trend of companies introducing NoSQL databases such as MongoDB to sit side by side with their corporate relational database, so they can view their data as JSON documents.  All well and good, and of course these NoSQL databases often have the advantage of being free, open source products, so they don’t cost anything, do they?  Well…actually they do, and it can be very costly to create such a fragmented “best of DB breed” strategy:

  • the most expensive part of your data processing environment is your people.  So now you need people who understand, in technical detail, not only your relational database, but also your new, shiny NoSQL ones.  Understanding and learning databases when they work and behave themselves is pretty easy.  However, those technical database guys really earn their money when the shit hits the fan and everything goes horribly wrong.  That’s when detailed expertise, usually gained the hard way, is absolutely vital.  If your database guys only know the basics of that NoSQL database, what are they to do?  That free, open source database could quickly become a very expensive and critical mistake.
  • those people also have a habit of leaving.  That guy who you depended on to fix the problems when the relational database got out of synch with the NoSQL one just walked out the door.  Where are you going to get another one in a hurry?  Database guys tend to know just one database in detail.
  • don’t underestimate the complexity of the processes needed to be put in place to synchronise that mix of databases.  Once again, those processes are pretty straightforward when it all works, but when something inevitably trips things up, getting things back to normal can be a devil of a job and can put you out of action for hours or days.

So what’s the alternative?  How about a database technology that has been around for decades (and is therefore tried and tested in the harshest of environments) and that can behave, simultaneously, in different ways?  A database that performs as well as, if not better than, any others out there, can scale to huge levels and has built-in synchronisation mechanisms that, again, are battle-hardened.  One technology, one set of skills for your database guys to learn.  One set of skills to retain and recruit.

In the world of databases, new, shiny and sexy appears to be the current trend.  In my opinion it’s a dangerous thing: the database technology you bet your business on needs to be tried and tested, and therefore new is not a very sensible bet.  It’s the one part of the IT landscape where old should be seen as a good and important thing!

One final thought: in writing these blog articles, particularly the “Phoenix” series, I began to realise that a Mumps database actually has a further interesting characteristic, and one that is perhaps unique in database-land.  It’s the fact that the category of database depends on how you want to consider your data.  Indeed, it’s possible to view the same stored data through different models.  For example, data you stored as a JSON document can be viewed as a graph, a set of key/value pairs, or perhaps as a hierarchical database.  InterSystems have done this for years now with Caché: define and store your data as persistent objects, yet view and analyse it using SQL as if it’s a relational database.
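To make that idea a little more concrete, here is a minimal sketch in Python.  It is not Mumps code, and it says nothing about how a Mumps engine works internally: the sample record and the helper names (as_key_values, as_edges) are made up purely for illustration.  All it shows is one stored document being read in three ways: as a hierarchy (the nested structure itself), as key/value pairs keyed by a path of subscripts, and as a graph of parent-to-child edges.

```python
# Hypothetical sample document; the field names are illustrative only.
document = {
    "name": "Rob",
    "address": {"city": "London", "country": "UK"},
    "tags": ["db", "nosql"],
}


def _children(node):
    """Return (key, child) pairs for a dict or list, or nothing for a leaf."""
    if isinstance(node, dict):
        return node.items()
    if isinstance(node, list):
        return enumerate(node)
    return ()


def as_key_values(node, path=()):
    """Key/value view: yield (path_of_subscripts, leaf_value) pairs."""
    if isinstance(node, (dict, list)):
        for key, child in _children(node):
            yield from as_key_values(child, path + (key,))
    else:
        yield path, node


def as_edges(node, path=()):
    """Graph view: yield (parent_path, child_path) edges; () is the root."""
    for key, child in _children(node):
        child_path = path + (key,)
        yield path, child_path
        yield from as_edges(child, child_path)


if __name__ == "__main__":
    # Hierarchical view: just walk the nested structure directly.
    print(document["address"]["city"])

    # Key/value view: every leaf addressed by its full path.
    for path, value in as_key_values(document):
        print(path, "=", value)

    # Graph view: the same data as a list of edges between nodes.
    for parent, child in as_edges(document):
        print(parent, "->", child)
```

Nothing is copied or converted into a second store here: both functions simply walk the same structure and report it differently, which is the point being made about choosing the view rather than the database.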

It came to my warped mind that there’s an interesting analogy with quantum physics, where photons can appear to be either waves or particles.  How they appear depends on the experiment: test for waves and they behave as waves, test for particles and they behave as particles.

So welcome to Heisenberg’s database!  Just don’t consider putting Mumps developers in a box…
