The INDIGO-DataCloud PaaS relies on Apache Mesos for:
- managed service deployment
- the execution of user applications
The instantiation of the highly available Mesos cluster is managed by the INDIGO Orchestrator in a fully automated way as soon as a user request, described by a TOSCA template, is submitted. Once the cluster is up and running, it can be reused for subsequent requests.
Mesos manages cluster resources (CPU, memory), providing isolation and sharing across distributed applications (frameworks).
Sophisticated two-level scheduling and efficient resource isolation are the key features of Mesos exploited by the INDIGO PaaS to run different workloads (long-running services, batch jobs, etc.) on the same resources while preserving isolation and prioritizing their execution.
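The two-level model above means the Mesos master decides *which* framework receives a resource offer, while the framework itself decides *what* to run on it. The following is a minimal, self-contained Python sketch of that interaction; the class names, the round-robin policy, and the task list are all illustrative and do not reflect the actual Mesos API:

```python
# Toy model of Mesos-style two-level scheduling.
# Level 1: the master decides WHICH framework gets a resource offer.
# Level 2: the framework decides WHAT to launch on the offered resources.
# All names and the round-robin policy are illustrative, not the real Mesos API.

class Offer:
    def __init__(self, cpus, mem):
        self.cpus, self.mem = cpus, mem

class Framework:
    """A framework (e.g. Marathon or Chronos) with a queue of pending tasks."""
    def __init__(self, name, tasks):
        self.name = name
        self.tasks = tasks  # list of (task_name, cpus, mem)

    def accept(self, offer):
        """Level 2: launch as many queued tasks as fit in the offer."""
        launched = []
        for task, cpus, mem in list(self.tasks):
            if cpus <= offer.cpus and mem <= offer.mem:
                offer.cpus -= cpus
                offer.mem -= mem
                launched.append(task)
                self.tasks.remove((task, cpus, mem))
        return launched

def master_schedule(frameworks, offers):
    """Level 1: hand each offer to the next framework in round-robin order."""
    placements = {}
    for i, offer in enumerate(offers):
        fw = frameworks[i % len(frameworks)]
        placements.setdefault(fw.name, []).extend(fw.accept(offer))
    return placements

marathon = Framework("marathon", [("web-service", 1.0, 512)])
chronos = Framework("chronos", [("batch-job", 2.0, 1024)])
placements = master_schedule([marathon, chronos], [Offer(4, 4096), Offer(4, 4096)])
print(placements)  # {'marathon': ['web-service'], 'chronos': ['batch-job']}
```

The point of the split is that the master never needs to understand framework-specific logic (health checks, job dependencies): it only accounts for resources, while Marathon and Chronos each apply their own placement decisions.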
INDIGO PaaS uses:
- Marathon to deploy, monitor, and scale long-running services, ensuring that they are always up and running.
- Chronos to run user applications (jobs), taking care of fetching input data, handling dependencies among jobs, and rescheduling failed jobs.
- Automatic deployment through Ansible recipes embedded in TOSCA and HOT templates.
- All services run in Docker containers.
- High availability of the cluster components:
  - leader election among master nodes managed by Zookeeper;
  - load balancing in HA;
  - service discovery through Consul, which also provides DNS functionality and health checks;
  - services are automatically registered in Consul as soon as they are deployed on the cluster.
- External access to the deployed services is ensured through load balancers in HA (unique entry point: the cluster Virtual IP).
- Cluster elasticity and application auto-scaling through the CLUES plugin.
- Ansible roles and TOSCA templates for cluster set-up featuring high availability, service discovery, and load balancing.
- Integration with the INDIGO Orchestrator:
  - job submission and service deployment requests are described through TOSCA templates;
  - definition of custom TOSCA types for describing Chronos jobs and Marathon applications;
  - cluster elasticity through the EC3/CLUES plugin.
- Zabbix monitoring probes for Mesos, Marathon, and Chronos.
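For illustration, a Chronos job like those described in the TOSCA templates is ultimately submitted to Chronos as a JSON document. The Python sketch below builds two such payloads, one scheduled job that fetches its input and one dependent job; the field names follow the Chronos job schema, but the job names, image, and URL are invented:

```python
import json

# Illustrative Chronos job definitions (the job names, Docker image, and
# input URL are invented). "uris" asks Chronos to fetch input data before
# the job starts, "retries" controls rescheduling of failed runs, and
# "parents" expresses a dependency on another job.
fetch_and_process = {
    "name": "process-data",
    "schedule": "R/2024-01-01T00:00:00Z/PT24H",  # repeat every 24 hours
    "command": "python process.py input.csv",
    "uris": ["https://example.org/data/input.csv"],  # fetched by Chronos
    "retries": 2,  # reschedule up to twice on failure
    "container": {"type": "DOCKER", "image": "python:3"},
}

# A dependent job: no "schedule", it runs only after "process-data" succeeds.
publish_results = {
    "name": "publish-results",
    "parents": ["process-data"],
    "command": "python publish.py",
    "retries": 2,
    "container": {"type": "DOCKER", "image": "python:3"},
}

print(json.dumps(fetch_and_process, indent=2))
```

The custom TOSCA types mentioned above map user-facing template fields onto exactly this kind of payload, so the user never writes Chronos JSON directly.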
The core components are:
- Consul for service discovery
- Mesos, the cluster manager providing efficient resource isolation and sharing across distributed services
- Chronos, a distributed task scheduler
- Marathon for the management of long-running containerized services
- Docker as container runtime
- mesos-consul, which populates the Consul service catalog with Mesos tasks
- marathon-consul, which bridges Marathon information to the Consul KV store
- haproxy-consul for dynamic HAProxy configuration based on Consul
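As an illustration of the automatic registration performed by mesos-consul, each running task ends up in Consul as a registration document submitted to the Consul agent API (`PUT /v1/agent/service/register`). This hedged Python sketch builds such a payload; the service name, address, and port are invented:

```python
import json

# Sketch of the payload that registers one service instance, with an HTTP
# health check, against the Consul agent API (PUT /v1/agent/service/register).
# The service name, address, and port below are invented for illustration.
registration = {
    "ID": "web-service-1",          # unique per instance
    "Name": "web-service",          # shared by all instances of the service
    "Address": "10.0.0.21",         # slave node running the Mesos task
    "Port": 31005,                  # host port assigned to the container
    "Tags": ["marathon"],
    "Check": {
        "HTTP": "http://10.0.0.21:31005/health",
        "Interval": "10s",          # Consul polls this endpoint every 10 s
    },
}
print(json.dumps(registration, indent=2))
```

Once registered, the instance becomes resolvable through Consul DNS (here as `web-service.service.consul`), and the health check keeps unhealthy instances out of the answers.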
These components are distributed on the cluster nodes as shown in the diagram below.
- Master nodes: on every master node the following (dockerized) components run: Zookeeper, Mesos master, Consul server, Marathon, Chronos.
- Slave nodes: on every slave node the following (dockerized) components run: Mesos slave, Consul agent.
- Load balancers: on the two load-balancer nodes the following (dockerized) components run: keepalived and haproxy-consul; keepalived ensures the high availability of the load balancers by managing the cluster Virtual IP.
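On the load balancers, haproxy-consul keeps the HAProxy configuration in sync with the services registered in Consul. The toy Python sketch below mimics that behaviour by rendering a backend section from a list of healthy instances; the service name and addresses are invented, and the real component regenerates the config by watching Consul rather than building it by hand:

```python
# Toy model of what haproxy-consul does: turn the healthy instances of a
# Consul service into an HAProxy "backend" section. Service name, addresses,
# and ports are invented; the real tool watches Consul and regenerates the
# config (and reloads HAProxy) whenever instances appear or disappear.

def render_backend(service, instances):
    """Render an HAProxy backend block for one Consul service."""
    lines = [f"backend {service}", "    balance roundrobin"]
    for i, (addr, port) in enumerate(instances):
        # "check" enables HAProxy's own health checking on each server
        lines.append(f"    server {service}-{i} {addr}:{port} check")
    return "\n".join(lines)

instances = [("10.0.0.21", 31005), ("10.0.0.22", 31012)]
config = render_backend("web-service", instances)
print(config)
```

Because both load balancers render the same configuration from the same Consul data, either one can serve traffic behind the keepalived-managed Virtual IP.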
The following roles are available in Ansible Galaxy:
- indigo-dc.marathon (source: https://github.com/indigo-dc/ansible-role-marathon)
These Ansible roles can be installed through the ansible-galaxy command:

```shell
ansible-galaxy install indigo-dc.rolename
```
- For support, please use the INDIGO-DataCloud CatchAll GGUS Support Unit.