You can think of Concord as the processing layer that connects all of your data sources (http servers, message queues, logs), business logic and data stores (MySQL, Postgres, MongoDB, HBase). Once Concord is installed, users can write code in a high-level, logical layer, leaving Concord to take care of distributing work throughout a cluster of computers.
How Does it Work?
A Concord cluster is composed of a few systems:
- Mesos Master
- Mesos Agent
- Concord Scheduler
Concord uses Apache Zookeeper as a reliable way to store metadata about the current state of the cluster. This data is then used to reliably recover in the event of a failure in the Concord Scheduler.
Mesos Master tracks the status of Mesos Agent instances and tasks running on them. It also provisions and kills tasks.
Concord scheduler manages the topology, defining the relationship between operators. It interfaces with the Mesos Master to schedule tasks and scale them up and down.
The Mesos Agent runs and oversees operators scheduled by the Concord Scheduler. Each operator is executed inside a Linux container.
The Concord Executor includes the following:
Tracing Engine Concord statistically samples messages flowing through the system and measures latency and throughput. This helps Concord users to determine potential bottlenecks and diagnose errors.
Operator User code built on the Concord client libraries
Router After a message is processed, the Concord router determines where to send the result. Concord supports multiple routing options, including:
Shuffle Routing Randomly routes a message to an instance of a given operator. It is the most efficient routing option, useful when an operator is mapping or filtering.
Group By Routing Ensures that messages with the same key are always sent to the same instance of an operator. Grouping is necessary for aggregations over a specific key, for example user IDs.