Open Source’s Main Advantage is NOT Cost

Open Source’s Main Advantage is NOT Cost

Let’s start with a trick question: How much does MySQL cost Facebook every year?

Depending on whom you ask, the answer ranges from “nothing” to “millions of dollars every year.” True, the MySQL software is free; it costs nothing for anyone to download it. What’s more, it’s free for anyone to make changes to suit their needs. (I can already sense Richard Stallman sneaking up behind me to fastidiously correct my egregious abuse of the word “free.”)

Operating MySQL, however, is another story.  For example, Facebook employs dozens of MySQL mad scientists to ensure that their MySQL instances keep up with its growth. In fact, they have modified MySQL so extensively that they’ve given their chimera database a name halfway between terrible developer marketing and a self-deprecating joke.

Last week at the Gartner BI and Analytics Summit in Las Vegas one of the participants told me that Facebook’s MySQL operation costs dwarf most companies’ Oracle contracts: retaining the exceptionally talented MySQL hackers means millions of dollars in salaries, RSUs and benefits. Point taken: Open source ain’t cheap. Relatively speaking, enterprise proprietary databases aren’t so expensive after all.

Indeed! But what’s an alternative for Facebook? It surely is not buying expensive commercial databases. No matter how many millions of dollars they spend on Oracle or Teradata, Facebook wouldn’t be able to get the kind of performance they get today from their MySQL. The real upshot of open source software is not cost savings but rather technical freedom to build on existing work. If you invest heavily in your software engineers, you can take an open source project and evolve it to something that no competitor can hope to build, let alone manage. This process might still take years and millions of dollars, but at least it’s possible — unlike with commercial, proprietary software that you cannot modify.

Docker vs. Heroku: A Terrible Comparison

To illustrate this point further, let’s look at Docker, the open source project du jour. One of the first applications of Docker to come out was Dokku (a name that clearly pays homage to Heroku), a minimal PaaS implemented with Docker and ~100 lines of shell script. When Dokku came out, developers immediately compared Docker to Heroku, the most popular PaaS.

Fast-forward almost two years, and although some people predicted Docker was going to crush Heroku, Heroku is still alive and well. This is not a surprise, since Docker versus Heroku is a misguided comparison — it is like comparing the MySQL software and operating MySQL.

Get Treasure Data blogs, news, use cases, and platform capabilities.

Thank you for subscribing to our blog!

The real value of Heroku is NOT containerizing apps. It is how Heroku delivers its core product value — by allowing developers to deploy apps quickly and reliably.

But it’s far from free to build and manage any serious software on Docker.

As much as I like and use Docker, you need to put in serious efforts to run it in production. As an active open source project, it’s a fast-moving target. Containers need to be orchestrated, and there are enough options to make the ecosystem both vibrant and overwhelming. If you are a developer trying to deploy a Rails e-commerce app, paying approximately $35 USD/month to Heroku is actually more affordable than trying to emulate the same setup using Docker on Digital Ocean.

Do you need to add Redis or Memcache? It’s a few clicks away on Heroku. With Docker deployed on your VMs,  you need to link your containers yourself or learn how to use one of the orchestration frameworks. How about logging? You can centralize your Docker container logs (learn how here), but it’s as easy as adding the Found Elasticsearch add-on on Heroku.

Just like Facebook with MySQL, if you decide to use Docker to build your own PaaS, you will need great engineers to manage it. Just like Facebook with MySQL, that might be the right technical decision for your company. But I guarantee you that more often than not, cost is not the right reason to opt to build your own infrastructure on open source software instead of buying more managed alternatives.

And it’s not just me saying this
What I have said here is nothing new. Many engineering leaders realized this tradeoff years ago. However, I am pleased to know that the larger IT world is catching up to this lesson.

Many organizations still launch into Hadoop and Big Data initiatives without sufficiently estimating all the costs involved. Read the new white paper, “Big Data Total Cost Of Ownership –  Evaluating Hard Costs and Options,” for a critical look at the hidden expenses and a breakdown of the hard costs of various implementation models over three years.

Kiyoto Tamura
Kiyoto Tamura
Kiyoto began his career in quantitative finance before making a transition into the startup world. A math nerd turned software engineer turned developer marketer, he enjoys postmodern literature, statistics, and a good cup of coffee.