Task coordination system
Requirements
- only management (submission, queuing, monitoring), not execution (like beanstalkd, and unlike resque);
- (durable) persistence -- all the task information (inputs, outputs, meta-data, etc.) must be stored in a crash-resistant data store;
- high throughput -- around a thousand operations per second, not including the execution time (like beanstalkd);
- introspection -- query the task states, overall statistics, etc. (like kue or resque);
- both push and pull execution requests (like GAE task queues);
- multiple protocol support -- for example a RESTful interface alongside the beanstalkd or memcached protocol;
- more than just queuing; for example, it must support the following:
  - simple mutual exclusion mechanisms;
  - simple locality-aware delivery (like consistent hashing);
  - execution throttling;
  - retries with a backoff algorithm;
  - (of course) priorities per task and priorities per queue (maybe also priorities per scope?);
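
Two of the requirements above, retries with backoff and locality-aware delivery, can be sketched in a few lines. This is only an illustration under stated assumptions, not the system's design: the names `backoff_delay` and `HashRing`, the capped exponential schedule with jitter, and the use of SHA-1 with virtual nodes are all hypothetical choices made for the example.

```python
import bisect
import hashlib
import random


def backoff_delay(attempt, base=1.0, cap=300.0, jitter=True, rng=random):
    """Seconds to wait before retry number `attempt` (1-based).

    Assumed policy: capped exponential backoff, optionally with full jitter.
    """
    delay = min(cap, base * (2 ** (attempt - 1)))
    if jitter:
        # Spread retries out so failed tasks do not all come back at once.
        delay = rng.uniform(0, delay)
    return delay


class HashRing:
    """Consistent-hash ring mapping a locality key to a worker (hypothetical)."""

    def __init__(self, workers, replicas=64):
        # Each worker gets `replicas` virtual points to even out the load.
        points = []
        for worker in workers:
            for i in range(replicas):
                points.append((self._hash(f"{worker}#{i}"), worker))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._workers = [w for _, w in points]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def worker_for(self, locality_key):
        if not self._hashes:
            raise ValueError("the ring has no workers")
        # First point clockwise from the key's hash, wrapping around the ring.
        index = bisect.bisect_right(self._hashes, self._hash(locality_key))
        return self._workers[index % len(self._hashes)]
```

With a ring like this, tasks sharing a locality key keep landing on the same worker, and removing one worker only moves the keys that were assigned to it.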
Design
Concepts
- scope -- similar to the concept of a database; it plays the following roles:
  - it is the highest level at which all logic (mutual exclusion, throttling, etc.) can be applied;
  - authentication and authorization apply at this level;
  - it allows sharing the same system between multiple disjoint applications;
  - it could be used for partitioning, sharding, etc.;
- task type -- similar to the concept of a table or class; it:
  - describes the input / output schema (if any);
  - describes the rules or plugins that apply to a certain task;
- task queue -- similar to a "queue" in resque or AMQP, or a "tube" in beanstalkd:
  - it is one of the channels through which executors pull (or are pushed) tasks;
  - queues are either "real" (as described above) or "synthetic" (grouping tasks by their state);
- task life cycle -- a standard finite-state machine (FSM) that specifies the allowed transitions and actions;
- task coordinators -- modules (plugins) that:
  - control the life cycle (within the allowed bounds) of a certain task;
  - provide feedback, hooks, notifications, and so on, for tasks;
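
The concepts above can be tied together in a minimal sketch. Everything named here is an assumption made for illustration: the state names, the `ALLOWED` transition table, the `Task` and `RetryLimit` classes, and the `allow` / `notify` coordinator interface do not come from the design itself. The point is only to show a coordinator steering a task inside the bounds set by the life-cycle FSM, and a "synthetic" queue derived from task state.

```python
import itertools
from enum import Enum


class State(Enum):
    # Hypothetical life-cycle states; the design does not enumerate them.
    SUBMITTED = "submitted"
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# The FSM: which transitions are allowed at all.
ALLOWED = {
    State.SUBMITTED: {State.QUEUED},
    State.QUEUED: {State.RUNNING},
    State.RUNNING: {State.SUCCEEDED, State.FAILED},
    State.FAILED: {State.QUEUED},  # a retry re-queues the task
    State.SUCCEEDED: set(),
}

_ids = itertools.count(1)


class Task:
    def __init__(self, scope, task_type, queue, coordinators=()):
        self.task_id = next(_ids)
        self.scope = scope          # database-like namespace
        self.task_type = task_type  # table- or class-like kind
        self.queue = queue          # the "real" queue it was submitted to
        self.state = State.SUBMITTED
        self.coordinators = list(coordinators)

    def transition(self, new_state):
        """Move to `new_state`; return False if a coordinator vetoes it."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} is not allowed")
        if not all(c.allow(self, new_state) for c in self.coordinators):
            return False
        old_state, self.state = self.state, new_state
        for coordinator in self.coordinators:
            coordinator.notify(self, old_state, new_state)
        return True


class RetryLimit:
    """Example coordinator: vetoes re-queuing after too many failures."""

    def __init__(self, max_failures):
        self.max_failures = max_failures
        self.failures = {}  # task_id -> number of failures seen so far

    def allow(self, task, new_state):
        if task.state is State.FAILED and new_state is State.QUEUED:
            return self.failures.get(task.task_id, 0) < self.max_failures
        return True

    def notify(self, task, old_state, new_state):
        if new_state is State.FAILED:
            self.failures[task.task_id] = self.failures.get(task.task_id, 0) + 1


def synthetic_queue(tasks, state):
    """A "synthetic" queue: all tasks currently in the given state."""
    return [task for task in tasks if task.state is state]
```

A coordinator in this sketch can only accept or veto transitions the FSM already permits; it cannot invent new ones, which is one reading of "within the allowed bounds".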
Links
Please see the dedicated page.