Announcing Spire for Mongo

Drawn to Scale announces Spire for Mongo


Today, we’re announcing that we’ve ported MongoDB onto Spire as a platform. What this means is:

1. You can easily scale your MongoDB cluster to hundreds of terabytes
2. You don’t need to change a line of code in your app to make it scale
3. You can use ANSI SQL (yes, joins), Mongo queries, and Hadoop *on the same data*.

By changing the storage and execution engine behind Mongo, Spire also alleviates problems such as difficult backups, indexes being constrained by memory, using only a single index in queries, and the ‘global/collection lock’…as well as others. V1.0 of Spire for Mongo focuses on CRUD and Aggregations.

Spire for Mongo is available to Drawn to Scale beta customers. Sign up to become a beta user now.

Customers are our inspiration and Devrim from Koding says it best:

“Koding is exchanging ~15M messages a day and currently 90% of them read or write from MongoDB. Growing with this scale, hitting 1M users will mean a Mongo nightmare, which is not that far away from us. What Drawn to Scale is doing with Spire for Mongo will allow us to massively scale Koding to hundreds of terabytes of data, without changing our application code.” says Devrim Yasar, CEO of Koding.

Why did we do this?

We firmly believe SQL is the ‘Rosetta Stone’ of data … but many app developers, especially at rapidly growing startups, love the simplicity and speed of Mongo. It helps them develop apps quickly instead of building ORMs or hand-coding stored procedures.

Yet MongoDB is a sharded database, and not built for large-scale distributed environments. There are only a handful of multi-terabyte Mongo deployments. Therefore, when apps need to handle ‘Big Data’, developers find themselves rewriting their apps to use a sharded SQL db, or a NoSQL key-value store.

MongoQL is also great because unlike SQL, there is only one flavor. So we can support that version, and developers can effortlessly use both Spire and MongoDB alongside each other without having to change app code.

How did we do this?


Spire Mongo Diagram

Something we have not mentioned: Spire was not built to be a distributed SQL database. Spire is a platform for *building* distributed databases. Every aspect of Spire is made to be distributed from the ground up: storage, schemas, indexing, execution, planning, optimization, etc. But they’re all abstracted from the query language. Our query execution engine has its own intermediate language that describes low-level actions (scan data, filter data, spool data, transport data to another node).

Since any query can be described with a handful of low-level actions, the only difference between a SQL query and a Mongo query is the parser and some storage metadata. Everything else, from optimization to execution, is exactly the same.
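To make the idea concrete, here is a minimal sketch in Python. It is not Spire's actual intermediate language: the action classes `Scan` and `Filter`, the two toy front-ends, and the `execute` function are all hypothetical names invented for illustration, and each front-end handles only a single equality match. The point is only that two different parsers can emit the same list of low-level actions, which one shared executor then runs.

```python
from dataclasses import dataclass
from typing import Any


# Hypothetical low-level actions, loosely named after the ones listed above.
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    field: str
    op: str
    value: Any


def plan_from_sql(sql: str) -> list:
    """Toy SQL front-end: only 'SELECT * FROM <table> WHERE <field> = <int>'."""
    tokens = sql.split()
    table, field, op, value = tokens[3], tokens[5], tokens[6], int(tokens[7])
    return [Scan(table), Filter(field, op, value)]


def plan_from_mongo(collection: str, query: dict) -> list:
    """Toy Mongo front-end: only equality matches such as {'age': 30}."""
    plan = [Scan(collection)]
    for field, value in query.items():
        plan.append(Filter(field, "=", value))
    return plan


def execute(plan: list, storage: dict) -> list:
    """Shared executor: it never learns which front-end produced the plan."""
    rows = []
    for action in plan:
        if isinstance(action, Scan):
            rows = list(storage[action.table])
        elif isinstance(action, Filter):
            if action.op != "=":
                raise ValueError("this sketch only supports equality filters")
            rows = [r for r in rows if r.get(action.field) == action.value]
    return rows


storage = {"users": [{"name": "ann", "age": 30}, {"name": "bob", "age": 41}]}
sql_plan = plan_from_sql("SELECT * FROM users WHERE age = 30")
mongo_plan = plan_from_mongo("users", {"age": 30})
```

In this sketch `sql_plan` and `mongo_plan` compare equal, so everything after parsing is shared between the two query languages.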

More technical details and examples will be presented in the following weeks.

It comes back to Interface vs. Infrastructure. SQL and MongoQL are merely interfaces to data — the infrastructure running those queries is more important. MySQL and Mongo are both single-node/shardable infrastructures. They work fantastically and efficiently on an individual server, but they were not built to maximize throughput and minimize latency in large clusters.


What does it mean?

If you’re a MongoDB user and you’re expecting to have hundreds of gigabytes to a few terabytes of data, stock Mongo is all you need. But if you find your resources being stretched, and you can’t add more users or features because it’ll bring the database down — Contact us and we’ll talk about how Spire can help you scale.

Is there a Database in Big Data Heaven? Understanding the world of SQL on Hadoop.

(co-authored by John Overton, VP of BizDev/Sales)

Is there a Database in Big Data Heaven?


2012 was certainly a watershed year of widespread adoption of Hadoop — and with it, the emerging adoption of SQL on Hadoop. We have strong regard for all the work ISVs have done to educate the market and deliver first generation solutions. Many firms, from large enterprises to start-ups, are now on the path of developing Big Data strategies that ultimately drive revenue, growth and differentiation. But Gartner’s Svetlana Sicular recently declared that Big Data is falling into what Gartner calls the “Trough of Disillusionment”…yet the good news is “Big data is moving from closets full of hidden Hadoop clusters into CEO’s corner offices.”

The market is primed and Hadoop solutions are certainly demonstrating the ability to store and process data at scale — but there’s a lack of satisfaction from customers.  Expectations of being able to run operational applications on Hadoop are not being met. So if Hadoop/HBase is heaven for large data sets, is there a database in Big Data Heaven?

To answer this, Drawn to Scale has been working on the hard problem of building an operational infrastructure on Hadoop with a SQL interface.

Currently, Hadoop is stuck in the world of Analytical applications. Yet there is a much larger need for user-facing apps with real-time reads and writes — like powering websites and mobile apps.  Developers now know that Hadoop scales data storage cheaply, but building applications without actual database functionality is a chasm of complexity.  It’s a clear example of “The Law of Conservation of Complexity” which states that every application has an inherent amount of irreducible complexity. This complexity cannot be wished away and has to be dealt with, either in product development or in user interaction.   A SQL interface on Hadoop infrastructure is still not a database.

The three fundamental aspects keeping Hadoop from being used operationally are:
1. MapReduce and HBase are low-level programming interfaces
2. MapReduce is a batch-oriented processing system
3. MapReduce is not built for concurrency




SQL and the RDBMS have been the standard way we interact with data for decades — these concepts have become almost synonymous. SQL is the language to interface and manipulate data, and the RDBMS the storage and execution infrastructure.

However, you can have a SQL interface without a database infrastructure behind it. Today, we’re seeing a similar approach with Hive, Phoenix, and Cloudera’s Impala — SQL interfaces focused on non-Operational infrastructure whose goal is to make data manipulation faster and easier for analytical use cases.

Think of the concept of a Vending Machine. The vending machine is the Interface to food (data) and the warehouse is where all the rest of the food is actually stored.

  1. Vending machines store only a sub-set of data that’s in the warehouse
  2. Vending machines have to be re-stocked in batch (by delivery trucks)
  3. Vending machines service single or few users at a time. (Analysts insert coins here)

Let’s take a look at the open source projects — Hive, Impala, and Phoenix — to better understand their Infrastructures relative to their SQL Interfaces.


Fly on, Albatross — a survey of current SQL on Hadoop solutions


Hive – A SQL Interface to MapReduce
Interface: HiveQL
Infrastructure: MapReduce
Storage: HadoopFS


Hive was one of the first non-native interfaces to MapReduce, Hadoop’s batch processing framework. Developers, not analysts, write scripts with the interface of HiveQL.  Scripts are then compiled into MapReduce jobs.  This cycle repeats for every new query manipulation needed, preventing the analyst from ever having a free form conversation with the data.

It’s also the least performant of the three due to the way MapReduce execution works — it scans over an entire data set, often with little structure. It then performs computations, moves the results of the computation across a network, performs more computations, moves it across the network again, and then stores results.
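The scan, compute, move, compute, store cycle described above can be sketched in a few lines of Python. This is a single-process illustration of the MapReduce pattern, not Hive or Hadoop code; the function names and the sample trades are made up for the example.

```python
from collections import defaultdict


def map_phase(records):
    # Scan every record in the data set and emit (key, value) pairs.
    for symbol, shares in records:
        yield symbol, shares


def shuffle_phase(pairs):
    # Group values by key. On a real cluster this step moves data
    # across the network between machines.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Second round of computation, one key at a time; a real job would
    # then write these results back to the distributed filesystem.
    return {key: sum(values) for key, values in groups.items()}


trades = [("AAPL", 100), ("GOOG", 50), ("AAPL", 25)]
totals = reduce_phase(shuffle_phase(map_phase(trades)))
```

Even for this tiny input, every record is read and every intermediate pair is regrouped before any answer exists, which is why the approach suits batch jobs better than interactive queries.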

Hive makes it easier to write pure Java code against a dataset, and the computations are highly customizable if you are a developer. Results are typically stored in a flat file format in the Hadoop distributed filesystem.

If Hive were a vending machine, a developer would make the request to an empty vending machine, then wait for a truck to deliver the requested data in the structure requested.  Not bad if you have the time to wait.  The chasm of complexity to get to operational functionality is just too large to even try.


Impala – SQL for low-latency data warehousing
Interface: Similar to HiveQL
Infrastructure: Massively Parallel Processing (MPP)
Storage: HadoopFS (occasionally HBase)


Cloudera’s Impala is an implementation of Google’s Dremel. Dremel relies on massive parallelization and brute force to arrive at answers to analytics queries in low minutes to seconds.  Similar to an MPP data warehouse, queries in Impala originate at a client node. This query is then sent to *every* data storage node which stores part of the dataset. Results are partially aggregated locally, then combined on the client machine.
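As a rough illustration of that scatter-gather pattern (not Impala's actual code), the sketch below computes an average across three hypothetical storage nodes. Each node returns a small partial result, a sum and a count, and the client combines them. The node contents and function names are assumptions made for the example.

```python
def partial_aggregate(rows):
    # Runs on each storage node: reduce local rows to (sum, count).
    return sum(rows), len(rows)


def scatter_gather_average(nodes):
    # The query is sent to every node that holds part of the data set.
    partials = [partial_aggregate(rows) for rows in nodes]
    # The client combines the partial results into the final answer.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


nodes = [[10, 20], [30], [40, 50, 60]]
average = scatter_gather_average(nodes)
```

Note that averaging the per-node averages would give the wrong answer when nodes hold different numbers of rows, which is why each node ships back a sum and a count instead.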

Imagine the overhead of every query hitting every machine in an operational context.  Impala would fall down, unable to handle more than a handful of concurrent requests.

Also unlike an RDBMS, Impala lacks indexes, optimization, and a true distributed execution environment.  Without indexes, many queries need to scan over the entire dataset to return results, which often results in *every query* running on *every machine*.  This approach leads to unpredictable performance based on query type and data set size.  In addition, without metrics gathering and optimization, Impala cannot determine how to minimize query execution time. This is especially important when it comes to filters and joins.
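To see why the lack of indexes matters, here is a small single-machine sketch that counts how many rows each approach has to examine. The data, the `dept` field, and the helper names are hypothetical; a real distributed index is far more involved, but the difference in work is the same in spirit.

```python
from collections import defaultdict

# 1,000 hypothetical rows; only 10 of them belong to the "eng" department.
rows = [{"id": i, "dept": "eng" if i % 100 == 0 else "ops"} for i in range(1000)]


def full_scan(rows, dept):
    # Without an index, every row must be examined.
    examined = 0
    matches = []
    for row in rows:
        examined += 1
        if row["dept"] == dept:
            matches.append(row)
    return matches, examined


def build_index(rows, field):
    # Map each field value to the positions of the rows that contain it.
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[field]].append(position)
    return index


def index_lookup(rows, index, value):
    # With an index, only the matching rows are touched.
    positions = index.get(value, [])
    return [rows[p] for p in positions], len(positions)


dept_index = build_index(rows, "dept")
scan_matches, scan_examined = full_scan(rows, "eng")
index_matches, index_examined = index_lookup(rows, dept_index, "eng")
```

Both approaches return the same 10 rows, but the scan examines all 1,000 rows while the lookup touches only the 10 matches.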

Without a distributed execution environment that can *move data between nodes*, operations like JOINs become much more difficult. Since joins require sorting and then matching aspects of a dataset together, the Impala client will need to bring large portions of the entire dataset to a single client (one server),  and then ‘join’ the information there. The dataset will also need to fit in memory.

We’re going to need a bigger vending machine and the operational viability is limited to patient internal users.


Phoenix – A SQL Interface to HBase
Interface: Small subset of SQL
Infrastructure: Coprocessor and Client-Side execution and filtering
Storage: HBase (key-value store)


Phoenix is a new open source project from Salesforce that provides a SQL interface to data that is stored in HBase. Claiming higher performance than Impala or MapReduce, Phoenix abstracts away a lot of tricky concepts with HBase, such as serialization, filters, and a basic ability to “push computation to the data” so network traffic is minimized. However, Phoenix lacks many aspects of functionality an operational database requires.

Similar to Impala, Phoenix cannot handle more than a handful of concurrent requests, unless the queries are simple, because it lacks indexes, optimization, and distributed execution.

The execution environment for Phoenix relies on HBase Coprocessors and a client which combines results from multiple machines. Coprocessors provide ‘hooks’ into the HBase engine and allow one to run code on events such as scanning for data or writing data. Therefore, Phoenix makes running queries like “Group all my employees by department” simpler, by retrieving data, aggregating it on the same machine it’s retrieved from, then sending that smaller dataset over the network — there’s the delivery truck again.
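The “group employees by department” example can be sketched as follows. This is plain Python standing in for the idea of server-side aggregation, not Phoenix or HBase coprocessor code; the two server lists and all names are invented for illustration.

```python
from collections import Counter

# Hypothetical department values for the employee rows held on two servers.
region_servers = [
    ["eng", "eng", "ops", "eng"],
    ["ops", "sales", "sales"],
]


def aggregate_on_server(departments):
    # Runs where the data lives: collapse rows into per-department counts.
    return Counter(departments)


partials = [aggregate_on_server(rows) for rows in region_servers]

# What crosses the network is the partial counts, not the raw rows.
rows_stored = sum(len(rows) for rows in region_servers)
entries_sent = sum(len(partial) for partial in partials)

# The client merges the partial counts into the final grouping.
employees_by_department = sum(partials, Counter())
```

Here 7 rows collapse into 4 partial entries before anything is sent. The saving grows with the number of rows per department, but the client still has to merge a result from every server.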

This vending machine can only store so much before it has to be restocked, and meanwhile users are lining up waiting for access.

Phoenix provides a good story for analytics and simple applications as a SQL Interface bolt on to HBase.  The infrastructure, however, limits its viability to analytical application verticals; anyone seeking operational capabilities will need to look elsewhere.


Pearls Before Swine


Climbing out of the trough requires vendors to enable firms to realize deeper business value with operational Big Data applications. This takes a radically new approach beyond just adding a SQL interface to Hadoop.  One that focuses on the infrastructure behind the SQL.  We modeled Spire after Google’s F1 — to combine the scale of BigTable with the simplicity of SQL, and a complete distributed execution environment.

We took this construct and applied it to the world of Big Data, and have developed a true operational database with purpose built infrastructure to provide:

  • Complete SQL interface for data storage and retrieval
  • Distributed indexing and query execution for predictable and manageable performance as user concurrency and data set grows
  • Optimization for rapid queries and joins to enable small reads and small writes against the entire data set in real time

Spire allows you to put the vending machine in the back of the building (for the analyst), while opening the doors of the warehouse to thousands of users who can access all the data in the warehouse, make changes to the data, and see the changes made by other users in real time.  It’s the difference between doing your grocery shopping through a vending machine and walking into a Costco store.

If your application requires an operational infrastructure with a SQL interface capable of thousands of reads/writes a second at Hadoop scale, check out Spire. 

Spire is in Limited Availability

A quick note:


I’m ecstatic to announce that after years of effort and research, Spire, the first real-time SQL Database for Hadoop, is in Limited Availability.

What this means is that you can start testing it today on real clusters, with real data.

If you want to try it, drop us a line. Requests have been overwhelming over the past few weeks, so we’re prioritizing based on who needs us the most first.

Over the next few months, we’re going to be giving more details on our architecture. While we speak SQL to the user, underneath is a resilient, fast, flexible distributed computational engine which is far less rigid than any database today.


Why We Chose MapR to build a Real-Time Database for Hadoop

(Want to try Spire, the real-time SQL database for Hadoop? Contact us.)

Today, we’re announcing that Drawn to Scale is the first company to publicly partner with MapR to redistribute M3 as part of our database, Spire.

I have always dreamt of the possibilities of a real-time operational database based on the Bigtable model.  In my previous work on HBase I have run into two major problems.  First is that developers have a hard time understanding HBase “schema” design, and frequently just wished for even basic SQL support.  And the second is that running real-time HBase is difficult to do on top of Hadoop in a performance and latency sensitive way.  At Drawn to Scale we fix these problems once and for all.  We solve the first problem with Spire — our indexing, schema, and query engine on top of the HBase database. But to solve the second problem, which has elements that go deep into the underlying Hadoop infrastructure, we are enlisting the aid of our new technology partner, MapR.

We are working with MapR to deploy their version of a HBase-compatible distributed filesystem embedded in Spire, and by doing so we avoid many of the problems that previously plagued running a real time HBase datastore. One of these problems is the disappointing and low random read performance, especially under concurrent load.  Concurrent load can trigger weird and difficult to tune thread-based configuration scalability limits, causing failure.  Even worse is the dependency on a single node (and difficult to distribute) metadata function.  And finally, running a production website demands backup and snapshot options.

To run a user-responsive near-time/next-click site, you need a low latency operational database that handles a large amount of concurrency.  There are two elements to solving this: one is making sure that random-read requests are sent down to the disk layer in a parallel fashion, and the second is making sure that there is no additional delay for retrieving data from the disk or kernel cache.  By running Spire and therefore HBase on top of MapR we take advantage of their optimized C++ file server that is highly asynchronous.  Combined with a multiplexed socket reuse model, Spire dispatches many random read requests at once without having to worry about socket-per-read or thread-per-read and running out of resources.  This allows the MapR file server to handle thousands or even tens of thousands of open files and concurrent disk operations.
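The contrast with thread-per-read can be illustrated with Python's asyncio. This is only an analogy for the asynchronous, multiplexed model described above, not MapR's C++ file server or Spire's client; `read_block` is a hypothetical stand-in whose sleep simulates disk latency.

```python
import asyncio


async def read_block(block_id: int) -> bytes:
    # Stand-in for one random read; the sleep simulates disk latency.
    await asyncio.sleep(0.01)
    return f"block-{block_id}".encode()


async def read_many(block_ids):
    # All reads are in flight at once on a single thread, so no thread
    # or socket is dedicated to any individual read.
    return await asyncio.gather(*(read_block(b) for b in block_ids))


results = asyncio.run(read_many(range(1000)))
```

A thousand simulated reads overlap instead of queueing behind each other, and the results come back in request order because `asyncio.gather` preserves the order of its arguments.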

In addition, by providing an optimized local-read case where the client is co-located with the file-server (common for HBase), the local fileserver is able to pass data to the HBase client process through shared memory, saving on data copies and TCP socket overhead. The fileserver is also pinned to a single CPU and uses a fixed allocation of RAM to provide caching.  This provides predictable and constrained resource consumption and helps avoid several nasty failure scenarios.

Beyond performance, a real-time operational database also needs to be fully available.  Traditionally all filesystem metadata is stored in the RAM of a single machine, which makes for a simple, but difficult to distribute, metadata function.  This also exposes a cluster to a massive single point of failure.  By leveraging MapR’s filesystem, we get a distributed metadata function where the file, directory, and storage mapping information is stored across the cluster, much in the same way as ext4 metadata is not concentrated in a single location on disk.  In addition, by using extent-based allocation, a file’s location is identified by which extent it is stored in, and the task of keeping track of replication and file placement is reduced by orders of magnitude.  This makes for easier fail-over, and since the underlying data is ultimately stored replicated in the filesystem anyway, it makes for an HA solution that is batteries-included.  There are no external dependencies whatsoever, which makes deployment and management substantially easier.

In addition to managing the software deployment, you also need to be able to manage the data itself.  Large-scale database dump jobs or hacky copies have been the typical way to solve the data backup problem.  A major feature provided by the MapR filesystem is a snapshot, which can provide point-in-time recovery of a Spire installation.  Once you have this consistent snapshot, you can do pretty much any typical database snapshot activity – copy to a new cluster, archive off to permanent storage, restore, etc.  And since the snapshot is copy-on-write, it’s easy going on your disk space usage.
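Why copy-on-write snapshots are cheap can be shown with a toy in-memory store. This is a sketch of the general technique, not the MapR filesystem; the `CowStore` class and its methods are hypothetical.

```python
class CowStore:
    """Toy block store: names map to immutable bytes objects."""

    def __init__(self):
        self._blocks = {}

    def write(self, name: str, data: bytes) -> None:
        # Point the name at a new block. Any snapshot taken earlier
        # still references the old block, which is left untouched.
        self._blocks[name] = data

    def read(self, name: str) -> bytes:
        return self._blocks[name]

    def snapshot(self) -> dict:
        # Copy only the name-to-block mapping, never the block contents.
        return dict(self._blocks)


store = CowStore()
store.write("a", b"v1")
store.write("b", b"v1")
snap = store.snapshot()
store.write("a", b"v2")
```

After the last write, `snap["a"]` is still `b"v1"` while the live store returns `b"v2"`, and the unchanged block `"b"` is shared between the two without being duplicated. Only data that changes after the snapshot costs extra space.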

Any one of these features would be an excellent addition to help solve the problem of running a HBase datastore in a real time environment.  Having all 3 in a ready to go package is a compelling offer that I could not ignore.  We are looking forward to bringing these features, and more, to our Spire customers.

Contact us to learn more about our SQL database for operational, interactive workloads by subscribing to our beta program to see if Spire solves your big data challenges!

The first real-time database for Hadoop. Demo’d at Hadoop Summit!

At Hadoop Summit, we will be publicly demoing (and premiering) Spire, the real-time database for Hadoop, at our booth.

In just three months’ time, we’ve been hauling in customers and pouring out code at a frantic pace. We’re a little ahead of schedule, so we wanted to give you a treat.

We have a preview release of our JDBC driver and we’ve wired it up to Pentaho, an open-source BI tool. We also have a fully interactive console.

This is real SQL, on real data (stock market trades), backed by our distributed indexes, single-path query engine, and more. You NEED to see it. Especially if you’ve read about Google’s F1 RDBMS, which is very similar to Spire.

And, we have an EPIC t-shirt from a well-regarded metal album cover artist…you need to get one of these before they disappear.



See you at the Drawn to Scale booth at Hadoop Summit.


We’re hiring. Here’s why we’re different.

Possibly the best thing about having a startup is finally hiring exactly the people you admire and want to learn from. It took two years to hire Ryan Rawson and Alex Newman. It was totally worth it.

And you should work for us too.  Just drop a line to …  a sentence saying “hi”.

So, why the hell should you work here instead of some companies that promise to IPO soon?

Because we’re changing the data infrastructure industry and you’re going to be essential.

You’re going to be the first technical non-founder. That’s pretty amazing. You’ll be integral to the product’s success…it won’t ship without you. And we’ll listen when you want to build something your way, using the tools you want.

You’re going to get to build something from the ground up, the way *you* want it. How many times have you raged against your database because it’s horrible/slow/inflexible? How many late-night phone calls or times spent optimizing joins instead of building cool stuff? You get the chance to build something you’ve wanted your entire life.

We don’t hire jerks. We surround ourselves with the folks we want to be like. You’re going to be in a free-thinking, judgment-free environment. This is rather rare in this part of the startup world, it seems.

Just drop a line to … just a sentence saying “hi”.

You’ll build software that *ships* and is the most important part of our customers’ infrastructure. You’ll get to talk to those customers so you know you’re not wasting your time with wrong features. You’re enabling entire new sectors of business.

We’re obsessed with building pragmatic things that work in “the real world” and joining them with the most cutting-edge distributed systems research. We’ve built and run some of the largest companies and infrastructures: Sun, Amazon, Google, Intel, and more. Even the CEO codes almost every day.

Learn more at the careers page.

Engineer: Database Core / Distributed Systems: San Francisco 

Help our core team build a database from the ground up. Finally, you can do things “the way they should be”. Instead of a db from the 1980s, we’re creating a platform for modern, real-time applications.

Here are some things you may enjoy doing or learning about:

  • Building query planners and optimizers
  • Compiler design
  • Functional programming (Scala, Clojure, etc.)
  • Distributed systems architecture: failover, replication
  • JVM tuning and performance hacks
  • Turning academic research into reality
  • Resilient systems for the real world

Engineer: Operations and Automation: San Francisco

Yes, this is a “DevOps” role. If you like coding *and* systems work, you’re going to enjoy this. You’ll be the one responsible for building clusters that heal themselves and deploy seamlessly in the cloud or customer sites.

    • Cluster automation
    • Deployment frameworks like Chef, Puppet, CFEngine
    • Building monitoring tools that you enjoy using
    • Upgrading and recovering from failure with no downtime
    • How to make Linux behave
    • Hadoop/HBase/BigTable/other distributed systems
    • And perhaps a bit of UX hackin’

Just drop a line to … just a sentence saying “hi”.