Intro
Dkron - Distributed, fault tolerant job scheduling system
Welcome to the Dkron documentation! This is the reference guide on how to use Dkron. If you are new to Dkron, start with the getting started guide.
What is Dkron
Dkron is a distributed system for running scheduled jobs against a single server or a group of servers of any size. One machine acts as the leader and the others act as followers. If the leader fails or becomes unreachable, one of the followers takes over and reschedules all jobs to keep the system healthy.
If the old leader comes back online, it rejoins the cluster as a follower.
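The failover behavior described above can be illustrated with a toy model. This is only a sketch: the `Cluster` class, its method names, and the "lowest name wins" election rule are hypothetical and chosen for brevity. It does not reproduce Dkron's actual leader election.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Cluster:
    """Toy model of leader failover (hypothetical, not Dkron's real logic)."""

    nodes: List[str]
    alive: Set[str] = field(default_factory=set)
    leader: Optional[str] = None

    def __post_init__(self) -> None:
        # Every node starts healthy, then an initial leader is chosen.
        self.alive = set(self.nodes)
        self._elect()

    def _elect(self) -> None:
        # A healthy leader keeps its role; otherwise pick a replacement.
        # Assumption: the alphabetically lowest live node wins.
        if self.leader in self.alive:
            return
        self.leader = min(self.alive) if self.alive else None

    def fail(self, node: str) -> None:
        # Mark a node as unreachable and re-elect if it was the leader.
        self.alive.discard(node)
        self._elect()

    def recover(self, node: str) -> None:
        # A returning node rejoins; it does not displace the current leader.
        self.alive.add(node)
        self._elect()

    def role(self, node: str) -> str:
        if node not in self.alive:
            return "down"
        return "leader" if node == self.leader else "follower"


if __name__ == "__main__":
    cluster = Cluster(["node1", "node2", "node3"])
    cluster.fail(cluster.leader)   # the leader becomes unreachable
    cluster.recover("node1")       # the old leader comes back
    for name in cluster.nodes:
        print(name, cluster.role(name))
```

The point of the sketch is the last step: a recovered node rejoins as a follower instead of reclaiming leadership, which avoids a second round of job rescheduling.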
Dkron is a distributed, fault-tolerant drop-in replacement for cron that is easy to set up, with a focus on being:
- Easy: Easy to use with a great UI
- Reliable: Completely fault tolerant
- Highly scalable: Able to handle high volumes of scheduled jobs and thousands of nodes
Dkron is written in Go and uses distributed key-value stores and Serf to provide fault tolerance, reliability and scalability while remaining simple and easy to install.
Dkron is inspired by the Google whitepaper "Reliable Cron across the Planet".
Dkron runs on Linux, OSX and Windows. It can run scheduled commands on a server cluster using any combination of servers for each job. It has no single point of failure because it relies on fault-tolerant distributed databases, and it works at large scale thanks to an efficient, lightweight gossip protocol.
Dkron uses this gossip protocol underneath to communicate with nodes, so failure notification and task handling run efficiently across a cluster of any size.
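To give a feel for why gossip scales well, the following sketch simulates a simplified push-style gossip, where every informed node forwards a message to a few random peers each round. It is an assumption-laden toy: the function name `gossip_rounds`, the `fanout` parameter and the round-based model are illustrative only, and Serf's real protocol (which is based on SWIM) is considerably more involved.

```python
import random


def gossip_rounds(num_nodes: int, fanout: int = 3, seed: int = 0) -> int:
    """Count rounds until a push-style gossip message reaches every node.

    Hypothetical model: in each round, every informed node sends the
    message to `fanout` peers picked uniformly at random.
    """
    if num_nodes < 1 or fanout < 1:
        raise ValueError("num_nodes and fanout must both be at least 1")

    rng = random.Random(seed)  # seeded so that runs are reproducible
    informed = {0}             # node 0 originates the message
    rounds = 0
    while len(informed) < num_nodes:
        newly_reached = set()
        for _ in informed:
            # Peers may already be informed; duplicates are simply absorbed.
            newly_reached.update(rng.randrange(num_nodes) for _ in range(fanout))
        informed |= newly_reached
        rounds += 1
    return rounds


if __name__ == "__main__":
    for size in (10, 100, 1000, 10000):
        print(size, "nodes ->", gossip_rounds(size), "rounds")
```

In this model the number of informed nodes grows roughly geometrically, so the round count increases slowly as the cluster grows, which is the intuition behind using gossip for large clusters.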
System Architecture
Dkron utilizes a server-agent architecture where multiple agents can form a cluster for high availability. Here's a high-level overview of how Dkron works: