Verk is a job processing system written in Elixir that uses Redis to store jobs. It uses Lists, Sorted Sets, Sets and String data structures from Redis to execute & retry jobs and detect node failure if multiple Verk instances are running.
Verk 2.0 will be using Redis Streams to solve a few issues that are hard to solve with the current data structure that represents a queue.
Which issues?
I will precisely outline these issues below:
Verk Web (our dashboard) has to run alongside a Verk instance to peek the current running jobs of a queue. This is by design as it would involve a lot of extra work to keep this inside Redis. It is also a very common request:
With a Redis Stream we can efficiently achieve this without extra work!
Verk aggressively pulls jobs in a single round trip. This is a problem as there’s no guarantee that multiple instances requesting the same data will get a fair amount of jobs. This could be achieved without a Redis Stream but it wouldn’t be trivial.
With a Redis Stream we can efficiently achieve this without extra work!
Verk uses Redis to keeps track of which instances are consuming which queues. To do this Verk has to store:
verk:node:#{id}
verk_nodes
;verk:node:#{id}:queues
Then every Verk instance keeps looking for instances that are inside the verk_nodes
set and do not have its key verk:node:#{id}
defined. If this happens we consider that the instance is dead as it’s incapable of keeping this key defined and it has expired. More info can be found here and here.
Redis Streams don’t provide this feature completely but it does 90% of the job! The next section will outline exactly what this means for each of these issues.
“The Stream is a new data type introduced with Redis 5.0, which models a log data structure.“
If you want to know exactly how Redis Streams work check out this blog post: https://redis.io/topics/streams-intro
Streams comes with excellent guarantees to represent a Verk Queue. Streams can be consumed by groups of consumers which automatically tracks which consumer was assigned which stream items.
Verk will be a Redis Stream group and each Verk instance will be a Redis Stream consumer of Redis Stream items (jobs).
Now I will explain each issue that I outlined in the previous section.
A Redis Stream keeps track of all pending items very efficiently. Pending items are items that were assigned to a specific consumer. We simply need to list all pending items!
When you have multiple consumers waiting for Redis Stream items to be processed, Redis will ensure that each consumer will receive items following a FIFO order: whichever consumer requested data first will receive first then the next etc.
A Redis Stream keeps track of consumers that requested the items. We can use this fact to check if an instance is still using that item. Verk still has to store one key per instance with an expiration that every instance keeps reseting it from time to time. Keys: verk:node:#{id}
From time to time each instance will look for pending items from the Redis Stream and check if the instance is alive (if the key verk:node:#{id}
is set). If the key is not set then we should claim those items to be processed again.
There’s a branch with all the changes listed above: https://github.com/edgurgel/verk/compare/streams. It should be merged to master soon!