Getting Concord on the DC/OS Universe

Deploying distributed systems can be a frustrating task. Each machine in your cluster needs to be provisioned specifically for its role before your programs can even run. Tools such as Docker, Ansible, and Chef help, but they still operate at a lower level of abstraction than you would ideally want. Beyond that, each application requires expert-level knowledge of its internals to be configured properly. Ideally users would have a single-click solution, and Mesosphere DC/OS now makes that possible.

DC/OS is a distributed systems operating system built atop Apache Mesos. The idea is that Mesos is the ‘kernel’ of your distributed system: it abstracts away computing resources such as CPUs, memory, and disk, and provides distributed programs with APIs for resource management and scheduling. DC/OS is the operating system that sits on top of this ‘kernel’. This additional abstraction helps with many of the common issues that arise when developing and maintaining distributed applications. Scheduling is handled by Marathon, an all-in-one, production-grade container orchestration platform that ships with DC/OS. Marathon is highly available, can bind persistent storage to an application, handles service discovery (via Mesos-DNS), load balancing, and constraint checking, exposes a metrics API, and more. Provisioning and deployment are handled by containerizing your applications and packaging them through the Mesosphere Universe. You interact with the cluster through a polished web UI or its command-line analogue, the dcos command.

These are some of the features that make DC/OS an excellent distribution channel. It lets the developer focus on business logic without needing expert-level knowledge of each framework. It delegates configuration to the people who know each package best, so your services are less likely to fail because of a misconfiguration.

[Figure: DC/OS Web Interface]

As mentioned earlier, packages are distributed on DC/OS clusters through the Mesosphere Universe, which in essence is just a GitHub repository. This repository is composed of curated packages, all of which have been vetted by the team at Mesosphere. To publish your own package to the Universe, you must fork the Mesosphere Universe repo on GitHub and submit a pull request with changes that define how your package is to be deployed. Publishing to the Mesosphere Universe increases exposure and lets programmers who want to test drive Concord do so with a single click.

[Figure: Concord on the Mesosphere Universe]

The Mesosphere Universe installs your package by deploying it through Marathon. Once we learned this, we felt it would be best to support all types of Marathon deployments, including those by users who run the Marathon scheduler without DC/OS. This led to the creation of the concord marathon command, which generates a deployable configuration file for Marathon from the given command-line options. To turn this into a deployable DC/OS package, you only need to modify the file to contain hooks for the variables that should be configurable. This works because the Universe expects a .json.mustache file and a corresponding config.json file that together define which variables in a deployment are configurable. Once that’s finished, bundle these files along with two other JSON files that hold package metadata, and that’s all there is to it. An additional, optional feature is the ability for the Universe to install a pip-installable Python CLI associated with your program through the DC/OS CLI. For Concord, we must take advantage of this feature, as our CLI is currently the only supported way to interact with Concord topologies on the cluster. Once installed, invoke the command through the DC/OS CLI like so:

$ dcos <your-package-name>
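
To make the templating step more concrete, here is a minimal Python sketch of how hooks in a mustache-style Marathon file could be filled from the defaults in a config.json schema. Everything in it is an assumption made for illustration: the property names, the `ZOOKEEPER_PATH` environment variable, and the trimmed app definition are hypothetical and are not the actual output of the concord marathon command, and the regex substitution is only a simplified stand-in for a real mustache renderer.

```python
import json
import re

# Hypothetical config.json: a JSON schema whose "default" values fill the template.
CONFIG_SCHEMA = {
    "type": "object",
    "properties": {
        "concord": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "default": "concord"},
                "cpus": {"type": "number", "default": 1.0},
                "mem": {"type": "number", "default": 1024},
                "zookeeper_path": {"type": "string", "default": "/concord"},
            },
        }
    },
}

# Hypothetical, trimmed Marathon app definition with mustache-style hooks.
MARATHON_TEMPLATE = """{
  "id": "{{concord.name}}",
  "cpus": {{concord.cpus}},
  "mem": {{concord.mem}},
  "env": {"ZOOKEEPER_PATH": "{{concord.zookeeper_path}}"}
}"""


def collect_defaults(schema):
    """Walk a JSON-schema object and return its defaults as a nested dict."""
    values = {}
    for name, spec in schema.get("properties", {}).items():
        if spec.get("type") == "object":
            values[name] = collect_defaults(spec)
        elif "default" in spec:
            values[name] = spec["default"]
    return values


def lookup(values, dotted_name):
    """Resolve a dotted name such as 'concord.cpus' against nested dicts."""
    node = values
    for part in dotted_name.split("."):
        node = node[part]
    return node


def render(template, values):
    """Replace each {{dotted.name}} hook with its value (variables only)."""
    pattern = re.compile(r"\{\{\s*([\w.\-]+)\s*\}\}")
    return pattern.sub(lambda match: str(lookup(values, match.group(1))), template)


if __name__ == "__main__":
    rendered = render(MARATHON_TEMPLATE, collect_defaults(CONFIG_SCHEMA))
    # The rendered text should be valid JSON that Marathon could accept.
    print(json.dumps(json.loads(rendered), indent=2))
```

The point of the split is that the template stays a plain Marathon definition, while the schema is the single place that declares what a user may override at install time.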

Unfortunately, there isn’t much documentation on how to properly package your CLI for DC/OS. However, there are plenty of helpful individuals from the Mesosphere team on the #DC/OSCommunity Slack channel. For additional help we also downloaded other packages’ CLIs and read their source code. We learned two things. First, your program name must start with dcos-. If it doesn’t, the DC/OS CLI will fail to locate and execute your CLI! Second, your command must support a --config-schema switch that prints a JSON schema describing the possible configuration options. With this schema, DC/OS lets you store defaults for your program using the dcos config command. Defaults are prefixed with your package name like this:

$ dcos concord --config-schema
{
  "$schema": "http://json-schema.org/schema#",
  "type": "object",
  "properties": {
    "zookeeper_hosts": {
      "type": "string",
      "title": "Zookeeper hosts URL",
      "description": "Zookeeper url for Concord.",
      "default": "localhost:2181"
    },
    "zookeeper_path": {
      "type": "string",
      "pattern": "^\/.*$",
      "title": "Zookeeper path",
      "description": "Concord storage path inside zookeeper.",
      "default": "/concord"
    }
  },
  "additionalProperties": false
}
$ dcos config set concord.zookeeper_path /concord
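
The two requirements described above (a program whose name starts with dcos-, and a --config-schema switch) can be sketched in a few lines of Python. This is not Concord's actual CLI: the program name `dcos-concord` and the helper names are assumptions for illustration, and only the schema itself is copied from the output shown above.

```python
import argparse
import json
import sys

# Schema copied from the --config-schema output shown above.
CONFIG_SCHEMA = {
    "$schema": "http://json-schema.org/schema#",
    "type": "object",
    "properties": {
        "zookeeper_hosts": {
            "type": "string",
            "title": "Zookeeper hosts URL",
            "description": "Zookeeper url for Concord.",
            "default": "localhost:2181",
        },
        "zookeeper_path": {
            "type": "string",
            "pattern": "^/.*$",
            "title": "Zookeeper path",
            "description": "Concord storage path inside zookeeper.",
            "default": "/concord",
        },
    },
    "additionalProperties": False,
}


def schema_text():
    """Return the configuration schema serialized as JSON."""
    return json.dumps(CONFIG_SCHEMA, indent=2)


def defaults_from_schema(schema):
    """Collect the default value of every property that declares one."""
    return {
        name: spec["default"]
        for name, spec in schema["properties"].items()
        if "default" in spec
    }


def main(argv):
    # "dcos-concord" is an assumed name; the text only requires the dcos- prefix.
    parser = argparse.ArgumentParser(prog="dcos-concord")
    parser.add_argument(
        "--config-schema",
        action="store_true",
        help="print the JSON schema of supported configuration options",
    )
    # Ignore arguments this sketch does not model (real subcommands would go here).
    args, _unknown = parser.parse_known_args(argv)
    if args.config_schema:
        print(schema_text())
    else:
        print("defaults:", json.dumps(defaults_from_schema(CONFIG_SCHEMA)))
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

Keeping the schema as a single data structure means the same definition can drive both the --config-schema output and the fallback values the CLI uses when nothing has been set with dcos config.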

All of the enabled DC/OS configuration properties can be accessed programmatically within your CLI through the DC/OS Python module (aptly named dcos). Importing this module also lets you make authenticated HTTP requests, so you can communicate with your service inside the cluster. Unfortunately, the Concord CLI doesn’t communicate with the scheduler over HTTP, so we’ll have to create a REST service that emulates the functionality of our CLI. The Concord scheduler has an Apache Thrift server listening for requests to deploy or register computations. Since Concord’s components already communicate over Thrift, we chose to use it everywhere for consistency, simplicity, and code reuse. For the time being, we have set up a dockerized environment that lets you manage your Concord DC/OS infrastructure from a node inside your cluster.

Overall, DC/OS integration was a painless experience. It lets Concord reach a whole new set of users who wouldn’t normally try the product because of provisioning nightmares. Our aim is to make distributed systems engineering available to the average programmer, and DC/OS has helped us do so. DC/OS is open source, and you can give it a test drive on most cloud providers. Once it is installed, make sure to look for Concord under the DC/OS Universe tab.

For more information on getting started, and for our full DC/OS integration instructions, check out our DC/OS installation docs.