I have seen the future and it’s Node.js shaped

Perhaps it’s because I’ve been around IT for quite a while: since the late 1970s in fact.  Perhaps I’m just jaded and cynical.  I’ve watched so many technologies come and go, and I’ve watched IT trends circling in and out of fashion.  I’m sorry to say that most times when the “next big thing” comes along, I watch the new guys first get really excited about it, then watch it progress to either dying a death or becoming “mainstream” and therefore “safe”, at which latter point the management consultants start recommending it to their customers.  Normally, throughout the process, I feel completely underwhelmed by the whole thing.  What they’re all raving about is normally, in my opinion, nothing to get excited about.

Suffice to say, I’ve never been interested in technologies as fads or fashions, or as a means of furthering my career in IT.  I’ve always been interested in technologies that are capable of doing what they really should be there for: allowing me and others to achieve more with less effort.  As my colleague at Hull University, who first got me into programming, used to say to me: at the end of the day I’m lazy.  I really don’t want to be laboriously doing stuff that the damned computer can be doing for me instead – that’s its job!  Yet, so often, I see technologies that require too much effort, require me to instruct the computer about what I want to do in far too minute detail, require me to state essentially the same thing over and over again in a ludicrously redundant way.  Too many steps involved, too much work.  I like instant gratification.  I’m impatient and lazy.  If there’s one thing I hate more than anything else, it’s writing more code than I think I should need to write.

Every now and again, though, along comes a technology that hits me between the eyes.  I give it a try and very soon a smile appears on my face and/or I mutter to myself: “Holy s**t!”.  “Getting” Mumps way back in the early 1980s was probably my first experience of this.  Seeing that first web page coming up in a flaky Cello browser (http://en.wikipedia.org/wiki/Cello_(web_browser)) on my home PC in about 1993 was the second, followed by jumping onto Javascript when it was first released by Netscape around 1994/5.

Lots of tedious, underwhelming technologies have gone under my bridge since then, and it wasn’t until about 2 years ago that I had another “Holy s**t” moment.  I’d been watching Douglas Crockford’s (http://en.wikipedia.org/wiki/Douglas_Crockford) latest video on Javascript and he started talking about something called Node.js.  I’d seen this technology being mentioned and talked about here and there but hadn’t spent any time investigating it.  However, here was the main man, the Javascript guru, getting excited about it.  Time, therefore, to investigate, and – wham! – I’m hooked!

I shan’t dwell here on what Node.js is.  A quick Google search will take you to all the information you need on it.  Suffice to say it’s a server-side implementation of Javascript.  Basically they’ve taken Javascript, whose natural home was previously the browser, and moved it server-side, to create a programming language ideally suited to network programming.  It’s event driven, all I/O (and resource handling generally) is asynchronous, and, courtesy of Google’s V8 Javascript engine, it’s one of the fastest, most scalable technologies I’ve come across since Caché and GT.M (eg see http://highscalability.com/blog/2012/10/4/linkedin-moved-from-rails-to-node-27-servers-cut-and-up-to-2.html).
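
To give a flavour of what “event driven” and “asynchronous” mean in practice, here is a minimal sketch using Node’s built-in fs module.  The file-system call returns immediately and its result arrives later via a callback.  The events array and its labels are purely illustrative and not part of any API.

```javascript
const fs = require('fs');

// Records the order in which things happen, purely for illustration.
const events = [];

events.push('stat requested');

// fs.stat is asynchronous: Node hands the work off and returns straight away.
// The callback runs later, once the event loop has the result.
fs.stat('.', (err, stats) => {
  events.push('stat completed');
  if (err) {
    console.error('stat failed:', err.message);
    return;
  }
  console.log('is directory:', stats.isDirectory());
  console.log('order of events:', events.join(' -> '));
});

// This line runs before the callback above, because nothing blocked.
events.push('carried on with other work');
```

The point of the sketch is simply that the process is free to get on with other work while the I/O is in flight, which is where much of Node.js’s scalability comes from.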

A key reason for my excitement about Node.js was the timing.  Taken in isolation, there’s maybe nothing too exciting, per se, I suppose, about Node.js.  But my discovery of Node.js coincided with the peak of the early interest in the NoSQL movement.

In my opinion, the NoSQL movement was the best thing that could have happened and it was long overdue.  For years – since the early 1980s in fact – “experts” in IT had deemed that the only database model worth giving any consideration to at all was the relational model.  Anything else was looked down on as some antiquated and worthless idea that should have been consigned to the IT dustbin of history years ago.  This “mainstream” thinking, based more on fad, fashion and industry inertia than on technical merit, was one of the factors that kept the Mumps database technology in the shadows for several decades.  Indeed, it’s a sad fact that many a great, highly efficient and effective Mumps-based application has been expensively ripped out and replaced by an underwhelming half-baked “mainstream” technology as a result of the “safe” recommendations of a management consultant: “well there’s your problem – that old Mumps stuff.  You need a proper database, mate.  I’m a consultant guru who specialises in databases, and I’ve never even heard of Mumps, so go figure!”

So the NoSQL movement that burst onto the scene in 2009 came as a breath of fresh air.  Suddenly the relational database was no longer king.  Suddenly other database models were in vogue: key value, document, columnar, graph.  And lo and behold, those same management consultants who, years earlier, had been responsible for chucking out Mumps databases, were starting to recommend to their customers that they really needed to be looking at NoSQL databases.  The irony was, of course, that the Mumps database they’d been recommending to be replaced a few years earlier was, in fact, a truly great NoSQL database that had pre-dated the NoSQL movement by several decades (http://www.mgateway.com/docs/universalNoSQL.pdf).

So I saw the growing wave of interest in NoSQL as a great way of bringing the Mumps database – whose primary modern implementations are Caché and GT.M – back into vogue.  There was a major stumbling block, however.  Pretty much every other database, the new NoSQL ones included, is just a database, and is accessed via most modern computer languages.  Mumps, however, has its own integrated language that is used to access and manipulate the database.  Unfortunately, as I’ve discussed elsewhere in this blog (A Case of Mumps), mention the words “Mumps database” and very soon someone would poison the idea by pointing to numerous articles rubbishing the Mumps language.  No matter how good or powerful the database, it was being subverted by opinion about its associated language.

Many who know Caché and GT.M will, of course, be thinking: but it’s been possible for ages to access the underlying Mumps database from mainstream languages.  That’s true, and, for example, our own company has created numerous language bindings for GT.M as part of our MGWSI technology (eg PHP, Python, Ruby, Java, .Net) (http://gradvs1.mgateway.com/main/index.html?path=mgwsiMenu).   However, in my mind, those “mainstream” languages have never quite gelled with the Mumps database – it’s difficult for me to put my finger on it and define properly what I mean: it’s more a gut feel I have about them.

But along comes Node.js and the possibility of server-side Javascript in a massively-scalable, super-fast environment.  Javascript, it turns out, has many of the characteristics I’ve always loved about the Mumps language (see A Case of Mumps).  They are both what I describe as “fast and loose”.  I can do lots of stuff very fast with them both.  I don’t end up fighting them and they don’t fight me.  I guess it’s because they both allow me to be lazy and give me that instant gratification.

Additionally, it was clear early on that Node.js was going to become massive.  Sure enough, that’s come to pass.  A year ago, Javascript surpassed Ruby on Github as being the most popular Open Source development language: all the result of Node.js.  The trendy language for young developers to learn is now Javascript.  Microsoft have been closely involved with the development of Node.js for quite some time.  It’s available for Windows, OS X and Linux.  Put simply, Node.js is huge, and still gaining momentum.  Expect to see the management consultants recommending it as a “safe” bet any time soon.

And so we have the potential for a perfect marriage: Node.js providing Javascript as a scripting language for the universal NoSQL database.

I’ve done a lot of work in this area: I think it’s fair to say I pioneered the whole idea of integrating Node.js with Caché and GT.M.  You’ll find my work on Github (https://github.com/robtweed).  It’s proven to me that the combination is extremely potent and powerful.  Probably the coolest thing I’ve managed to achieve in this area is to have persuaded InterSystems to include a native Node.js interface to Caché.  Actually, that interface is the work of my colleague Chris Munt, but carried out on behalf of InterSystems.  InterSystems first released it as part of their free Globals database (http://globalsdb.org/), but they’ve now made it available for their latest version of Caché.

The really special aspects of this Node.js interface to Caché are:

  • it makes use of a very high-performance, low-level call-in interface to Caché’s global database engine;
  • Caché runs in-process with Node.js, so there is no network-layer bottleneck.  It’s blindingly fast.  It’s such a close binding that Javascript becomes the scripting language for Caché;
  • Chris provided synchronous versions of the APIs which make it possible to build properly-layered high-level APIs on top of the low-level ones provided by the raw Node.js/Caché interface.  For example, I’ve implemented a persistent XML DOM, with the DOM modelled as a graph database in global storage, and manipulated via the DOM APIs implemented in Javascript.

These things add up to something very special, and something where the whole is definitely greater than the sum of its parts.
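
To illustrate the layering idea, here is a rough sketch of how a higher-level function can be built on top of synchronous primitives.  It does not use the real Node.js/Caché interface: the GlobalStore class, its set, get and order methods, and the listChildren helper are all hypothetical in-memory stand-ins, loosely modelled on Mumps’ set, get and $Order.

```javascript
// Hypothetical in-memory stand-in for a global store (not the real
// Node.js/Cache interface). Each node is addressed by a global name
// plus an array of string subscripts.
class GlobalStore {
  constructor() {
    this.nodes = new Map(); // JSON-encoded path -> value
  }

  key(name, subscripts) {
    return JSON.stringify([name, ...subscripts]);
  }

  // Equivalent in spirit to a Mumps "set".
  set(name, subscripts, value) {
    this.nodes.set(this.key(name, subscripts), value);
  }

  // Equivalent in spirit to a Mumps "get": '' when the node has no data.
  get(name, subscripts) {
    const value = this.nodes.get(this.key(name, subscripts));
    return value === undefined ? '' : value;
  }

  // Equivalent in spirit to $Order: given a parent path and the current
  // subscript at the next level, return the following one ('' at the end).
  // Simplification: plain string ordering, not Mumps collation.
  order(name, parent, current) {
    const children = new Set();
    for (const encoded of this.nodes.keys()) {
      const path = JSON.parse(encoded);
      if (path[0] !== name || path.length < parent.length + 2) continue;
      const matches = parent.every((s, i) => path[i + 1] === s);
      if (matches) children.add(String(path[parent.length + 1]));
    }
    const sorted = [...children].sort();
    const next = sorted.find((s) => current === '' || s > current);
    return next === undefined ? '' : next;
  }
}

// A higher-level function built purely from the primitives above.
// Because the primitives are synchronous, it is just an ordinary loop.
function listChildren(store, name, parent) {
  const result = [];
  let sub = store.order(name, parent, '');
  while (sub !== '') {
    result.push(sub);
    sub = store.order(name, parent, sub);
  }
  return result;
}

const store = new GlobalStore();
store.set('patient', ['123', 'name'], 'Smith');
store.set('patient', ['123', 'dob'], '1970-01-01');
store.set('patient', ['456', 'name'], 'Jones');

console.log(listChildren(store, 'patient', []));
console.log(listChildren(store, 'patient', ['123']));
console.log(store.get('patient', ['123', 'name']));
```

With asynchronous primitives, that simple while loop would have to be rewritten around callbacks, which is why the synchronous versions matter so much for building higher-level layers.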

Whilst this ideal level of close integration is currently limited to the proprietary Caché, there’s no reason why a similar in-process interface couldn’t be built for the Open Source GT.M.  I’m hoping someone will get on and do this.  In my opinion this model represents the future for the Mumps technology: a super-fast, highly-scalable, ultra-flexible NoSQL database, accessed by the hottest and coolest, most trendy language out there.  Instant gratification guaranteed.

Holy s**t!

16 comments

  1. Adam Simpson

    Rob – Having had only a fairly brief look at the node.js interface to Cache, I think the one area where it is badly lacking [and which will probably put off the “hip kids”] is the kind of query API offered by ‘rivals’ such as Mongoose / MongoDB [e.g. Order.find({customerID: '1234'}).where('createdDate').lt(oneYearAgo)…;].

    Are you aware of any efforts to implement something similar to the above in Cache, so that we don’t need to use the low-level $Order & $Query equivalents to traverse the globals?

    I’d also be interested in any efforts to allow “legacy” [hate that term] global data [i.e. delimited fields] to be exposed/updated via node.js – perhaps using a mechanism similar to class storage maps?

  2. Adam

    I kind of agree with you and kind of don’t 🙂

    In another post (All I want for Christmas) I mentioned the need for an SQL and a Map/Reduce interface.

    The cool thing about the Cache Node.js interface is that it provides all the bottom-level primitive APIs you need to manipulate globals, allowing any higher level of access to be built up from those APIs to create higher-level modes of abstraction and database access.

    A good example is my ewdDOM project that you’ll find in my Github repo. It has been built on top of the primitive APIs to create a full set of Javascript XML DOM API functions. The persistent XML DOMs are created, stored and manipulated as a graph database that is mapped on top of global storage in Cache/GlobalsDB. Those functions are ultimately just sets/gets/$orders etc, but the user just sees an integrated persistent XML DOM in Node.js.

    So it would be equally possible for someone to write an SQL projection all in Javascript, or a Map/Reduce interface in Javascript. It’s all just sitting there waiting for someone to go out and do it. I see no reason why someone couldn’t create an object projection on top of global storage, but implemented in Javascript. In a limited XML DOM-orientated way, that’s basically what I’ve done in ewdDOM.
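
    As a rough illustration of the idea, here is a sketch of a Mongoose-flavoured finder built from nothing but set/get/$Order-style primitives. Everything in it is hypothetical: the in-memory tree stands in for global storage, and the set, get, order, saveOrder and findOrders names are made up for the example rather than taken from the Cache Node.js interface.

```javascript
// Hypothetical in-memory stand-in for global storage (not the real
// Cache interface): a tree of nodes, each with a value and children.
const root = { value: '', children: new Map() };

function findNode(path, create) {
  let node = root;
  for (const sub of path) {
    if (!node.children.has(sub)) {
      if (!create) return null;
      node.children.set(sub, { value: '', children: new Map() });
    }
    node = node.children.get(sub);
  }
  return node;
}

// Primitive "set" and "get".
function set(path, value) {
  findNode(path, true).value = value;
}

function get(path) {
  const node = findNode(path, false);
  return node ? node.value : '';
}

// Primitive $Order equivalent: next subscript after `current` below `path`,
// or '' when there are no more. Plain string ordering, for simplicity.
function order(path, current) {
  const node = findNode(path, false);
  if (!node) return '';
  const sorted = [...node.children.keys()].sort();
  const next = sorted.find((s) => current === '' || s > current);
  return next === undefined ? '' : next;
}

// Higher-level layer: save an order and maintain an index on customerID.
function saveOrder(id, doc) {
  for (const [field, value] of Object.entries(doc)) {
    set(['order', id, field], value);
  }
  set(['orderByCustomer', doc.customerID, id], '');
}

// Finder implemented as an $Order-style walk of the index.
function findOrders({ customerID }) {
  const results = [];
  let id = order(['orderByCustomer', customerID], '');
  while (id !== '') {
    results.push({
      id,
      customerID: get(['order', id, 'customerID']),
      createdDate: get(['order', id, 'createdDate']),
    });
    id = order(['orderByCustomer', customerID], id);
  }
  return {
    // Chained filter in the style of .where(field).lt(limit).
    where: (field) => ({
      lt: (limit) => results.filter((doc) => doc[field] < limit),
    }),
    all: () => results,
  };
}

saveOrder('A1', { customerID: '1234', createdDate: '2012-03-01' });
saveOrder('A2', { customerID: '1234', createdDate: '2013-06-15' });
saveOrder('B1', { customerID: '9999', createdDate: '2011-01-10' });

// ISO-format date strings compare correctly as plain strings.
const olderOrders = findOrders({ customerID: '1234' })
  .where('createdDate')
  .lt('2013-01-01');
console.log(olderOrders);
```

    The user of findOrders never sees the index or the $Order loop, which is the whole point of pushing up the level of abstraction.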

    If we can get the same kind of synchronous, in-process APIs available for the OpenSource GT.M, we have something that a lot of people outside the existing GT.M/Cache communities might find very interesting!

    So, I guess it’s initially up to those of us old farts who understand the primitive get/set/kill/$o/$f stuff to push up the levels of abstraction to a level that the hip kids can pick up and run with. What’s there is sufficient to make the rest possible.

    Rob

  3. I think the other thing that needs to be remembered: whilst many of the NoSQL databases have just one modus operandi and one built-in way of being accessed/queried, the Mumps database is real low-level stuff with a simple, low-level set of primitive access APIs. As the “Universal NoSQL” paper demonstrated, it’s like a basic low-level palette of NoSQL capabilities, waiting for people to build cool stuff on top. So that example you gave to access/search could, in a global-storage based system, be just one of many different ways to search all manner of different high-level database projections….all at once if you wanted.

    The Mumps database engine isn’t just some one-trick pony of a NoSQL database: it can be all manner of different kinds of projection and it’s a matter of building those projections. The cool thing about having the Node.js interface is that those projections can all be developed and written in Javascript.

    Rob

  4. Adam Simpson

    Thanks Rob – I had considered that the projections could be built in javascript over the primitive APIs, but I suppose my concern was how this would impact performance compared to a ‘native’ solution (e.g. how does the speed of 100,000 ‘mydata.next’ iterations in node compare to the equivalent number of ‘$Order’ calls in Cache?).

    I haven’t actually used it yet, so haven’t been able to do any benchmarking (nor have I found any info on this).

    Adam

  5. Good question: and here’s the really cool thing. The Cache/Node.js interface is in-process and not over some network/inter-process connection, and the Cache call-in interface goes straight to the low-level part of the global engine (it’s the same interface used for their “Java Extreme” interface). So there is really very little overhead using Javascript compared with Mumps code to access global storage.

    Remember, too, that for Cache, you also have access to an API that allows you to invoke a Mumps function. If you really did find a performance issue, then you can always hand off the work to a Mumps function to do the heavy lifting. My newest version of the EWD gateway for Node.js does exactly this. Eventually I’d like to re-implement EWD in Javascript, but for now I call directly into the EWD run-time engine by invoking a wrapper Mumps function from the Node.js side – this variant of EWD writes its output to a temporary global which is then sucked back into Node.js using the ($)Order Javascript API – it works very fast in fact!

    Perhaps you can see why I’m so excited about this Node.js interface (and why I’d love to see someone do an equivalent for the open source GT.M)!

    Rob

  6. BTW there are some benchmark results posted on the GlobalsDB forum (http://globalsdb.org). The GlobalsDB engine is basically the core global database engine taken out of Cache, so the performance figures for GlobalsDB/Node.js should be the same as for Cache/Node.js.

    Summary: it’s *very* fast!

  7. Thanks Rob

    Appreciate these posts, hope they open up your important work to an ever wider audience.

    Would like to pose a related question as to how you see the role of Javascript in future?

    From a business change point of view, an accessible and agile technical stack is very appealing.
    It occurs to me that having a single language up and down that stack could/should make for a more productive environment than currently pervades the industry.

    Could you comment on how you see the role of Javascript in future and the role of Node.JS therein… which should help put into context the role of the power of the NoSQL DB tools you mention.
    (Perhaps that might form the basis of another very useful post, but might also help explain the potential important role of Node.Js to more of us here.)

    A tiny snippet of related “foobar” code that might explain what the future might look like would be also of interest..

    Thanks!
    Tony

  8. Interesting! But it’s not very different from running an embedded key-value database like LevelDB or LMDB in your Node.js application?

    1. ngrilly

      Except that we’re not talking about an in-process database that is limited to key/value storage, but a full-blown document database – a persistent JSON store – with data retrieval/update granularity at the key/value level *within* a JSON document, and at any level within that document.
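
      As a rough sketch of what that granularity means, the following maps a JSON document onto one key/value node per leaf, so a single value can be read or updated on its own, or any sub-tree rebuilt on demand. It is a hypothetical in-memory illustration only: the nodes map and the saveDocument, getValue, setValue and getDocument functions are made-up names, not the actual interface.

```javascript
// Hypothetical stand-in for global storage: a flat map from a
// JSON-encoded subscript path to a leaf value.
const nodes = new Map();

// Store a JSON document by writing one node per leaf value.
// Simplification: arrays would come back as objects keyed by index.
function saveDocument(name, doc, path = []) {
  for (const [key, value] of Object.entries(doc)) {
    const childPath = [...path, key];
    if (value !== null && typeof value === 'object') {
      saveDocument(name, value, childPath);
    } else {
      nodes.set(JSON.stringify([name, ...childPath]), value);
    }
  }
}

// Read or update a single leaf without touching the rest of the document.
function getValue(name, path) {
  return nodes.get(JSON.stringify([name, ...path]));
}

function setValue(name, path, value) {
  nodes.set(JSON.stringify([name, ...path]), value);
}

// Rebuild the document, or any sub-tree of it, from the stored nodes.
function getDocument(name, path = []) {
  const prefix = [name, ...path];
  const doc = {};
  for (const [encoded, value] of nodes) {
    const full = JSON.parse(encoded);
    if (!prefix.every((s, i) => full[i] === s)) continue;
    const rest = full.slice(prefix.length);
    if (rest.length === 0) continue;
    let target = doc;
    for (const key of rest.slice(0, -1)) {
      if (!target[key]) target[key] = {};
      target = target[key];
    }
    target[rest[rest.length - 1]] = value;
  }
  return doc;
}

// Store one patient document under the subscript '123'.
saveDocument('patient', {
  id: '123',
  name: { first: 'Jane', last: 'Smith' },
  address: { city: 'Hull' },
}, ['123']);

// Update a single value deep inside the document.
setValue('patient', ['123', 'address', 'city'], 'York');

console.log(getValue('patient', ['123', 'name', 'last']));
console.log(getDocument('patient', ['123', 'address']));
console.log(getDocument('patient', ['123']));
```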

      Rob

      1. Yes, but implementing data retrieval and update inside the JSON document, on top of something like LMDB or LevelDB, is trivial when the database is in-process. I don’t understand what the added value is here.

      2. From the LevelDB Google Code page on its limitations:

        – This is not a SQL database. It does not have a relational data model, it does not support SQL queries, and it has no support for indexes.
        – Only a single process (possibly multi-threaded) can access a particular database at a time.

        By comparison, GT.M has at least one SQL projection available, whilst it’s built-in to Cache. Both databases allow multi-process access – as exploited by the queue/child_process architecture of EWD.js (http://ewdjs.com) to obtain higher concurrency and therefore throughput/performance. Both GT.M and Cache support transactions. LevelDB looks way too limiting for most purposes I’d want to use it for.

        LMDB, however, does look interesting, though I’d still make the point that it is designed to be a simple key/value store, albeit a very fast one, and I’m sceptical about your assertion that it’s trivial to model a full document (JSON) database structure on top of it.

        My own particular interest is in healthcare IT where Cache in particular is very much a dominant technology. You’ll find Cache (and increasingly GT.M) being used to model key/value, columnar, graph, document, XML and relational databases, often all at the same time, and often such that data stored using one model is retrieved using another – that multi-faceted nature is pretty interesting and very powerful. It means, for example, that I can use EWD.js to create applications on top of old legacy code (of which there’s a huge amount in healthcare!) based on a JSON document view of data that was originally modelled in a completely different way, without having to do any modifications to the legacy application. In this environment, considering using something like LMDB would be out of the question as it would require massive rewriting of legacy code, which would just not be cost-effective and would be way too high-risk.

        However, for new applications, I’d be interested to see what LMDB is like. Equally, from your side, I’d suggest you take a more detailed look at GT.M and Cache, beyond the use to which I put them within EWD.js – see http://www.mgateway.com/docs/universalNoSQL.pdf for starters.

        Rob

  9. Thomas in Doubt

    “Caché runs in-process with Node.js, so there is no network-layer bottleneck. It’s blindingly fast. It’s such a close binding that Javascript becomes the scripting language for Caché;”

    This part really caught my eye. I also went and read the GT.M documents, but they did not say anything about how it supports replication / high availability. Everything they said was really vague, repeating “fast, proven, simple, flexible” without explaining what is meant by that or what kind of DB setup you usually use with this solution.

    Does this mean that your DB is actually always local to your node.js instance? Is this solution really meant to serve one process at a time, or a few processes on a single machine? If not, I don’t get how there is no network-layer bottleneck. If it is, I don’t get how that kind of solution would be truly scalable horizontally across multiple servers.

    Nevertheless agreed that Node rocks, and reading this was interesting.

  10. Thomas

    Thanks for your comment.

    Scaling out an EWD.js based service can be achieved in many ways

    – consider a single EWD.js system as a Mumps database with integrated web server / web service interface – any HTTP(S) based techniques can be then used – eg using a load-balancer in front of a farm of EWD.js systems

    – in the case of the Cache database technology, you can use its ECP networking to scale out at the database server – database server level (ie behind the EWD.js/Node.js layer). You can map GT.M Globals across networked servers – sort of the equivalent to Cache’s capability, but done at the OS level.

    – you can use the EWD REST server (https://github.com/robtweed/ewdrest) to front-end EWD.js systems – see the latter set of slides in this presentation:

    http://gradvs1.mgateway.com/download/EWDjs-VistANovo.pdf

    Rob

  11. Sidney Levy

    “Javascript, it turns out, has many of the characteristics I’ve always loved about the Mumps language” After reading your very interesting article, why do we have to change the scripting language? If Mumps is so close to js and comes natively with its DB with all the functions to manipulate it, let’s write it in Mumps instead of Node.js. What are the advantages of Node.js over the Mumps scripting language?

    1. Hi Sidney

      Glad you found the article interesting. Probably the best answer to your question is provided in my “Phoenix” set of articles which starts with this (follow the link at the end of each article to get to the next – 3 parts in all): https://robtweed.wordpress.com/2013/01/22/can-a-phoenix-rise-from-the-ashes-of-mumps/
