DC/OS: Dynamic Resourcing


Suppose we are running a service on a node and it is producing some data to be used later. Then the node is restarted. What happens to that data? It evaporates, and on the next boot the machine comes up empty.

Let’s extend the example and consider that we have to reserve some resources on the node. How would we do that? Once we have, how do we ensure that those resources will be available? Also, is there a way to let some other application use a resource while it sits unused? That last problem was a challenge in DC/OS, but it and all the ones above are solved by Dynamic Resourcing in DC/OS.


Why does an application want reserved resources? Stateful services need persistent storage for the volumes of information they gather. These resources remain in place until they are explicitly destroyed.

Usually, a stateless application runs in a sandbox with no durable storage. For stateful services, we need to integrate persistent storage with the sandbox in which the service runs. DC/OS allows persistent volumes to be mounted into a task’s Mesos sandbox. This is useful for databases and caches, because the data is still there on the next boot and the service can respond faster.
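To make the volume idea a bit more concrete, here is a minimal sketch of an app definition that asks for a local persistent volume. It assumes the service is launched through Marathon and uses the field names from Marathon's local persistent volume format as I understand it; the app id, command, sizes, and the helper name stateful_app are made up for illustration, so check them against your DC/OS version before relying on them.

```python
import json


def stateful_app(app_id, cmd, volume_path, size_mib):
    """Build a Marathon-style app definition with one local persistent volume.

    Hypothetical helper: the field names are assumed from Marathon's
    documented format and are not taken from this article.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": 0.5,
        "mem": 256,
        "instances": 1,
        "container": {
            "type": "MESOS",
            "volumes": [
                {
                    # Path relative to the task's sandbox.
                    "containerPath": volume_path,
                    "mode": "RW",
                    "persistent": {"size": size_mib},
                }
            ],
        },
        # Assumption: stateful apps should not start extra copies during an upgrade.
        "upgradeStrategy": {"minimumHealthCapacity": 0.5, "maximumOverCapacity": 0},
    }


app = stateful_app("/demo-cache", "sleep 3600", "data", 100)
print(json.dumps(app, indent=2))
```

The printed JSON is what you would hand to the scheduler; the task then sees a data directory inside its sandbox that survives restarts.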


Mesos initially provided a mechanism to reserve resources on specific slaves, but that reservation was static: the operator could reserve resources only when the slave started up. This was later extended into dynamic reservation, which enables operators and authorized frameworks to reserve resources in the cluster at runtime.

In both types of reservation, resources are reserved for roles. A role can represent a user or a group within an organization, but it could also represent a service, a framework, etc. Schedulers subscribe to these roles in order to receive resources and schedule work on behalf of the resource consumer(s) they are servicing.

We will be discussing both of these types.

Static Reservation

An operator can configure a slave with resources reserved for a role. For example, suppose we have 12 CPUs and 6144 MB of RAM available on a slave and we want to reserve 8 CPUs and 4096 MB of RAM for some role. The operator specifies that split in the slave’s configuration at startup. The only problem with this kind of reservation is that if the requirement drops or increases, we have to drain the slave and restart it with a new configuration.
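Here is a small sketch of how that 12 CPU / 6144 MB split could be written down. It builds the value for the Mesos agent's --resources flag, where name(role):amount marks a statically reserved portion. The role name ads and the helper resources_flag are placeholders of mine, not something from the article.

```python
def resources_flag(total, reserved, role):
    """Format a --resources value that splits `total` into an unreserved
    part and a part statically reserved for `role`.

    Hypothetical helper, written only to illustrate the split.
    """
    parts = []
    for name, amount in total.items():
        held = reserved.get(name, 0)
        if held > amount:
            raise ValueError(f"cannot reserve {held} {name} out of {amount}")
        if amount - held:
            parts.append(f"{name}:{amount - held}")    # left unreserved
        if held:
            parts.append(f"{name}({role}):{held}")     # reserved for the role
    return ";".join(parts)


flag = resources_flag({"cpus": 12, "mem": 6144}, {"cpus": 8, "mem": 4096}, "ads")
print(f'--resources="{flag}"')
# --resources="cpus:4;cpus(ads):8;mem:2048;mem(ads):4096"
```

Changing any of these numbers means editing the flag and restarting the slave, which is exactly the pain point of static reservation.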

Dynamic Reservation

It is often better to leave every resource on the slave unreserved and manage reservations dynamically via the master. Static reservations cannot be unreserved after use. With Dynamic Reservation, however, operators and authorized frameworks can free up the resources, which allows flexibility at runtime. Reservations can be managed in two ways:

  • The Offer::Operation::Reserve and Offer::Operation::Unreserve messages are available for frameworks to send back in response to a resource offer.
  • /reserve and /unreserve HTTP endpoints allow operators to manage dynamic reservations through the master.
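For the operator route, a rough sketch of what a /reserve request body might look like is below. It only builds the form data and sends nothing. The slaveId and resources field names and the reservations layout follow the Mesos operator endpoint format as I recall it; the role ads, the principal ops, the <slave_id> placeholder, and the helper reserve_form are assumptions for illustration.

```python
import json
from urllib.parse import urlencode


def reserve_form(agent_id, role, principal, amounts):
    """Build the form fields for a dynamic reservation request.

    Hypothetical helper: verify the exact field layout against the
    Mesos version running in your cluster.
    """
    resources = [
        {
            "name": name,
            "type": "SCALAR",
            "scalar": {"value": value},
            "reservations": [
                {"type": "DYNAMIC", "role": role, "principal": principal}
            ],
        }
        for name, value in amounts.items()
    ]
    return {"slaveId": agent_id, "resources": json.dumps(resources)}


form = reserve_form("<slave_id>", "ads", "ops", {"cpus": 8, "mem": 4096})
body = urlencode(form)  # would be POSTed to the master's /reserve endpoint
print(body)
```

The same shape of body, sent to /unreserve, would hand the resources back. Authentication and the exact master URL depend on how your cluster is set up.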

There might be an interesting situation where two dynamic reservations are made for the same role on a single node. What then? The reservations are combined by adding together the resources reserved by each request, which results in a single reserved resource on the node. Similarly, it is possible to free only some of the resources that have been reserved for a role on a node, rather than all of them.
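The arithmetic is simple enough to model in a few lines. This is a toy bookkeeping class, not Mesos code; NodeReservations and the role name ads are invented for the example.

```python
from collections import defaultdict


class NodeReservations:
    """Toy model of the reserved amounts on one node, keyed by role and resource."""

    def __init__(self):
        self._reserved = defaultdict(float)

    def reserve(self, role, name, amount):
        # A second reservation for the same role is added to the first.
        self._reserved[(role, name)] += amount

    def unreserve(self, role, name, amount):
        held = self._reserved.get((role, name), 0.0)
        if amount > held:
            raise ValueError(f"only {held} {name} reserved for {role}")
        # Freeing part of a reservation leaves the rest in place.
        self._reserved[(role, name)] = held - amount

    def reserved(self, role, name):
        return self._reserved.get((role, name), 0.0)


node = NodeReservations()
node.reserve("ads", "cpus", 2)
node.reserve("ads", "cpus", 4)      # merged with the first request
node.unreserve("ads", "cpus", 1)    # partial unreserve
print(node.reserved("ads", "cpus"))
```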

Here’s a warning for ya

Dynamically reserved resources cannot be unreserved while a task is using them. And if a persistent volume has been created out of those reserved resources, you need to destroy that volume first before you unreserve.

There’s a twist in this story, though. It starts from here.

Reservation Refinement

Hierarchical roles enable the delegation of resources down a hierarchy. Reservation Refinement is the mechanism by which reservations are delegated down the hierarchy. Suppose our hierarchy looks like this: roleOne/roleTwo/roleThree.

Suppose that roleOne has reserved some resources. These resources can now be passed down to roleTwo and roleThree as well. Say roleOne owned the resources and they got delegated to roleThree. What happens if the resources are unreserved / set free? They go back to the owner, which in this case is roleOne.

Frameworks that want to use refinement must enable it explicitly by declaring the RESERVATION_REFINEMENT capability.
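One way to picture refinement is as a stack of roles attached to a resource: delegating pushes a role further down the hierarchy, and unreserving pops it again so the resource lands back with the previous owner. The sketch below is a toy model under that assumption; the functions refine and unreserve are mine and not part of any Mesos API.

```python
def refine(stack, role):
    """Push a refined reservation; `role` must sit below the current owner."""
    if stack and not role.startswith(stack[-1] + "/"):
        raise ValueError(f"{role} is not below {stack[-1]} in the hierarchy")
    return stack + [role]


def unreserve(stack):
    """Pop the most recent refinement, returning the resource to the previous owner."""
    if not stack:
        raise ValueError("resource is not reserved")
    return stack[:-1]


owner = refine([], "roleOne")                              # roleOne reserves
delegated = refine(owner, "roleOne/roleTwo/roleThree")     # passed down to roleThree
returned = unreserve(delegated)                            # set free by roleThree
print(returned)
# ['roleOne']
```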

We talked about merging reservations on a node. LABELS are associated with each reserved resource and behave as its metadata. They explain the intended purpose of a portion of the resources that have been reserved.

Another Warning for ya

Two reservations with different labels cannot be combined, even if the reservations are on the same slave and use the same role.

Persistent Storage

Mesos supports the creation of persistent storage from disk resources that have been reserved, whether statically or dynamically. This storage exists outside the task’s sandbox and persists on the node even after the task dies or exits. When the task exits, the resources can be offered back to the framework so that it can launch the same task again, launch a recovery task, or launch a new task that consumes the previous task’s output as its input.

This storage allows stateful services such as Cassandra to store their data within Mesos. Otherwise, they would have to write the data to a remote filesystem whose location must be well known.
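As a rough illustration, a persistent volume can be described as a reserved disk resource with some extra persistence details attached. The field layout below follows the Mesos resource format as I remember it; the role, principal, volume id, path, and the helper persistent_volume are placeholders, so treat this as a sketch to compare against the Mesos documentation rather than a reference.

```python
import json


def persistent_volume(role, principal, size_mb, volume_id, container_path):
    """Describe a persistent volume carved out of dynamically reserved disk.

    Hypothetical helper; field names are assumed, not taken from this article.
    """
    return {
        "name": "disk",
        "type": "SCALAR",
        "scalar": {"value": size_mb},
        "reservations": [
            {"type": "DYNAMIC", "role": role, "principal": principal}
        ],
        "disk": {
            # Identifies the volume so it can be found again after the task exits.
            "persistence": {"id": volume_id, "principal": principal},
            # Where the volume shows up inside the task's sandbox.
            "volume": {"mode": "RW", "container_path": container_path},
        },
    }


volume = persistent_volume("ads", "ops", 2048, "my-volume", "data")
print(json.dumps(volume, indent=2))
```

Because the volume is tied to the reservation, it has to be destroyed before the underlying disk can be unreserved.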

I hope you find this blog helpful.



Written by 

Software Consultant