What is Mesos?
In layman's terms, imagine a busy airport.
Airplanes are constantly taking off and landing.
There are multiple runways, and an airport dispatcher assigns time slots to airplanes to land or take off.
In this picture, Mesos is the airport dispatcher, runways are compute nodes, airplanes are compute tasks, and frameworks like Hadoop, Spark and Kubernetes are the airline companies.
In technical terms, Apache Mesos is an open source cluster manager that handles workloads efficiently in a distributed environment through dynamic resource sharing and isolation. This means that you can run any distributed application that requires clustered resources, e.g. Spark or Hadoop.
It sits between the application layer and the operating system and makes it easier to deploy and manage applications in large-scale clustered environments.
Mesos allows multiple services to scale and use a shared pool of servers more efficiently. The key idea behind Mesos is to turn your data center into one very large computer.
Apache Mesos can be seen as the opposite of virtualization: in virtualization, one physical resource is divided into multiple virtual resources, while in Mesos, multiple physical resources are pooled into a single virtual resource.
Who is using it?
Prominent users of Mesos include Twitter, Airbnb, MediaCrossing, Xogito and Categorize. Airbnb uses Mesos to manage its big data infrastructure.
Mesos leverages features of modern kernels for resource isolation, prioritisation, limiting and accounting. This is normally done by cgroups in Linux and zones in Solaris. Mesos provides resource isolation for CPU, memory, I/O, file system, etc. It is also possible to use Linux containers, but the current isolation support for Linux containers in Mesos is limited to CPU and memory.
Architecture of Mesos:
The Mesos master is the heart of the cluster. It guarantees that the cluster will be highly available. It hosts the primary user interface, which provides information about the resources available in the cluster. The master is the central source of information about all running tasks, and it keeps all task-related data in memory. For completed tasks, only a fixed amount of memory is set aside, which allows the master to serve the user interface and task data with minimal latency.
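To make the "fixed amount of memory for completed tasks" idea concrete, here is a minimal sketch. It is not Mesos code: the `TaskRegistry` class and its `max_completed` parameter are hypothetical names used only for illustration, and the bounded history is modelled with a fixed-size `deque`.

```python
from collections import deque


class TaskRegistry:
    """Toy in-memory task store (hypothetical, not the Mesos master)."""

    def __init__(self, max_completed):
        # Running tasks are kept in full; completed ones in a bounded buffer.
        self.running = {}
        self.completed = deque(maxlen=max_completed)

    def start(self, task_id, info):
        self.running[task_id] = info

    def finish(self, task_id, state):
        # Move the task to the completed history; once the buffer is full,
        # the oldest completed entry is dropped automatically.
        info = self.running.pop(task_id)
        self.completed.append((task_id, state, info))


registry = TaskRegistry(max_completed=2)
for task_id in ("t1", "t2", "t3"):
    registry.start(task_id, {"cpus": 1})
for task_id in ("t1", "t2", "t3"):
    registry.finish(task_id, "TASK_FINISHED")

remembered = [task_id for task_id, _, _ in registry.completed]
```

With a limit of two, only the two most recently completed tasks stay in memory, so the memory used for history does not grow with the number of tasks the cluster has ever run.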
The Mesos agent holds and manages the container that hosts the executor (everything runs inside a container in Mesos). It manages the communication between the local executor and the Mesos master, acting as an intermediary between them. The agent publishes information about the host it is running on, including data about running tasks and executors, the available resources of the host, and other metadata. It also guarantees the delivery of task status updates to the schedulers.
A Mesos framework has two parts: the scheduler and the executor. The scheduler registers itself with the Mesos master and, in turn, gets a unique framework ID. It is the scheduler's responsibility to launch tasks when its resource requirements and constraints match an offer received from the Mesos master. It is also responsible for handling task failures and errors. The executor executes the tasks launched by the scheduler and reports back the status of each task.
We can also write a custom framework. To write your own framework:
- Write your scheduler by inheriting from the scheduler class.
- Write your framework executor by inheriting from the executor class.
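The two steps above can be sketched roughly as follows. This is a self-contained illustration, not the real Mesos bindings: the `Scheduler` and `Executor` base classes, the `Offer` type, and every method name here are stand-ins assumed for the example, so check the Mesos framework development guide for the actual API.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A resource offer from the master (hypothetical, simplified shape)."""
    agent_id: str
    cpus: float
    mem_gb: float


class Scheduler:
    """Stand-in base class; the real Mesos bindings define their own."""

    def registered(self, framework_id):
        pass

    def resource_offers(self, offers):
        return []

    def status_update(self, task_id, state):
        pass


class Executor:
    """Stand-in base class for the component that runs tasks on an agent."""

    def launch_task(self, task_id):
        raise NotImplementedError


class EchoScheduler(Scheduler):
    """Launches one task on every offer that meets a fixed requirement."""

    def __init__(self, cpus_needed, mem_needed_gb):
        self.cpus_needed = cpus_needed
        self.mem_needed_gb = mem_needed_gb
        self.framework_id = None
        self.states = {}

    def registered(self, framework_id):
        # The master hands back a unique framework ID on registration.
        self.framework_id = framework_id

    def resource_offers(self, offers):
        launches = []
        for offer in offers:
            # Only use offers that satisfy the task's requirements.
            if offer.cpus >= self.cpus_needed and offer.mem_gb >= self.mem_needed_gb:
                task_id = f"{self.framework_id}-task-{len(launches)}"
                launches.append((task_id, offer.agent_id))
        return launches

    def status_update(self, task_id, state):
        # Record what the executor reported for each task.
        self.states[task_id] = state


class EchoExecutor(Executor):
    def launch_task(self, task_id):
        # A real executor would run the workload here.
        return "TASK_FINISHED"


scheduler = EchoScheduler(cpus_needed=1, mem_needed_gb=1)
scheduler.registered("framework-1")
launches = scheduler.resource_offers(
    [Offer("agent-1", cpus=4, mem_gb=4), Offer("agent-2", cpus=0.5, mem_gb=8)]
)
executor = EchoExecutor()
for task_id, agent_id in launches:
    scheduler.status_update(task_id, executor.launch_task(task_id))
```

Here the offer from `agent-2` is ignored because it has too little CPU, which mirrors the scheduler's job of matching requirements against offers before launching anything.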
How Does Mesos Work?
- Agent 1 informs the master that it has 4 CPUs and 4 GB of memory available. The master then invokes the allocation policy module to decide which framework should receive an offer.
- The master sends a resource offer describing what is available on agent 1 to framework 1.
- The framework's scheduler replies to the master with information about two tasks to run on the agent, using <2 CPUs, 1 GB RAM> for the first task, and <1 CPU, 2 GB RAM> for the second task.
- Finally, the master sends the tasks to the agent, which allocates the appropriate resources to the framework's executor. The remaining 1 CPU and 1 GB of RAM are still unallocated, so they can be offered to another framework.
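The arithmetic of this walkthrough can be replayed with a small simulation. This is a toy model, not Mesos itself: the `Master` and `Resources` classes and their methods are hypothetical names introduced only to track what stays unallocated on the agent.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpus: float
    mem_gb: float

    def fits(self, other):
        # True if `other` can be carved out of these resources.
        return other.cpus <= self.cpus and other.mem_gb <= self.mem_gb

    def minus(self, other):
        return Resources(self.cpus - other.cpus, self.mem_gb - other.mem_gb)


class Master:
    """Toy master that tracks what each agent still has unallocated."""

    def __init__(self):
        self.available = {}

    def register_agent(self, agent_id, resources):
        # Step 1: the agent reports its free resources.
        self.available[agent_id] = resources

    def make_offer(self, agent_id):
        # Step 2: the master offers what is currently free on the agent.
        return agent_id, self.available[agent_id]

    def launch(self, agent_id, tasks):
        # Step 4: accept tasks that fit, deducting each from the agent's pool.
        launched = []
        for name, needed in tasks:
            free = self.available[agent_id]
            if free.fits(needed):
                self.available[agent_id] = free.minus(needed)
                launched.append(name)
        return launched


master = Master()
master.register_agent("agent-1", Resources(cpus=4, mem_gb=4))
agent_id, offer = master.make_offer("agent-1")

# Step 3: framework 1 replies with two tasks sized as in the walkthrough.
tasks = [("task-1", Resources(2, 1)), ("task-2", Resources(1, 2))]
launched = master.launch(agent_id, tasks)
leftover = master.available["agent-1"]
```

After both tasks are launched, `leftover` holds 1 CPU and 1 GB, which is exactly what the master could offer to a second framework.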
Features of Mesos:
- A web UI to monitor cluster state.
- Multi-resource scheduling.
- Fault tolerance and high availability.
- The ability to share resources across many frameworks.
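To give a feel for multi-resource scheduling across frameworks: Mesos's default allocation module is based on Dominant Resource Fairness (DRF), which repeatedly hands resources to the framework whose largest resource share is currently the smallest. The sketch below is a simplified toy version of that idea, not the Mesos allocator; the function names, framework names, and cluster numbers are all illustrative assumptions.

```python
def dominant_share(allocated, total):
    # A framework's dominant share is its largest fractional share
    # across all resource types.
    return max(allocated[r] / total[r] for r in total)


def drf_allocate(total, demands):
    """Toy DRF loop: give one task at a time to the framework with the
    lowest dominant share, until no remaining framework's task fits."""
    used = {r: 0 for r in total}
    allocated = {name: {r: 0 for r in total} for name in demands}
    counts = {name: 0 for name in demands}
    active = set(demands)
    while active:
        name = min(sorted(active), key=lambda n: dominant_share(allocated[n], total))
        demand = demands[name]
        if all(used[r] + demand[r] <= total[r] for r in total):
            for r in total:
                used[r] += demand[r]
                allocated[name][r] += demand[r]
            counts[name] += 1
        else:
            # Simplification: a framework whose next task does not fit
            # is dropped from further consideration.
            active.discard(name)
    return counts


total = {"cpus": 9, "mem_gb": 18}
demands = {
    "framework-a": {"cpus": 1, "mem_gb": 4},  # memory-heavy tasks
    "framework-b": {"cpus": 3, "mem_gb": 1},  # CPU-heavy tasks
}
counts = drf_allocate(total, demands)
```

With these numbers, framework-a ends up with three tasks and framework-b with two, so each holds two thirds of its dominant resource (memory for one, CPU for the other).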
Mesos vs YARN
Both systems have the same goal: allowing you to share a large cluster of machines between different frameworks.
- By default, YARN schedules on memory only (e.g. you request x containers of y MB each), and CPU scheduling has to be enabled explicitly, whereas Mesos handles both memory and CPU scheduling.
- Mesos uses Linux control groups (cgroups), and YARN uses simple Unix processes.
- The Mesos authentication module uses the Cyrus SASL library. SASL is a flexible framework that allows two endpoints to authenticate with each other using a variety of methods. By default, Mesos uses CRAM-MD5 authentication, whereas YARN relies on Kerberos. Hadoop's security features consist of authentication, service-level authorization, authentication for web consoles, and data confidentiality.
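For readers unfamiliar with CRAM-MD5, the idea is a challenge-response exchange: the server sends a one-time challenge, and the client answers with an HMAC-MD5 of that challenge keyed by its secret, so the secret itself never crosses the wire. The sketch below shows only that core mechanism using Python's standard library. It is not how Mesos or Cyrus SASL is invoked, and the function names and the sample principal and secret are made up for illustration.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    # Server side: a fresh, unpredictable challenge per authentication attempt.
    return secrets.token_hex(16).encode()


def client_response(principal, secret, challenge):
    # Client side: prove knowledge of the secret without sending it.
    digest = hmac.new(secret, challenge, hashlib.md5).hexdigest()
    return f"{principal} {digest}"


def server_verify(response, credentials, challenge):
    # Server side: recompute the digest from the stored secret and compare.
    principal, _, digest = response.rpartition(" ")
    secret = credentials.get(principal)
    if secret is None:
        return False
    expected = hmac.new(secret, challenge, hashlib.md5).hexdigest()
    return hmac.compare_digest(digest.encode(), expected.encode())


credentials = {"framework-1": b"s3cret"}  # hypothetical principal and secret
challenge = make_challenge()

good = client_response("framework-1", b"s3cret", challenge)
bad = client_response("framework-1", b"wrong", challenge)

accepted = server_verify(good, credentials, challenge)
rejected = server_verify(bad, credentials, challenge)
```

Because the challenge changes on every attempt, a captured response cannot simply be replayed later, although MD5-based schemes are generally considered weak by modern standards.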
The aim of this blog is to introduce you to Mesos: what it is and how it compares with YARN. In the next blog, we will explore Mesos further. So, stay tuned 🙂
Please feel free to suggest or comment!