Verk 2.0 & Redis Streams

Verk

Background

Verk is a job processing system written in Elixir that uses Redis to store jobs. It uses Lists, Sorted Sets, Sets and String data structures from Redis to execute & retry jobs and detect node failure if multiple Verk instances are running.

Verk 2.0 will be using Redis Streams to solve a few issues that are hard to solve with the current data structure that represents a queue.

Which issues?

  • Verk Web can’t see all the jobs that are running for a queue;
  • Fair distribution of jobs across Verk instances is hard;
  • Node failure detection is inefficient.

I will precisely outline these issues below:

Verk Web can’t see all the jobs that are running for a queue

Verk Web (our dashboard) has to run alongside a Verk instance to peek the current running jobs of a queue. This is by design as it would involve a lot of extra work to keep this inside Redis. It is also a very common request:

With a Redis Stream we can efficiently achieve this without extra work!

Fair distribution of jobs across Verk instances is hard

Verk aggressively pulls jobs in a single round trip. This is a problem as there’s no guarantee that multiple instances requesting the same data will get a fair amount of jobs. This could be achieved without a Redis Stream but it wouldn’t be trivial.

With a Redis Stream we can efficiently achieve this without extra work!

Node Failure detection is inefficient

Verk uses Redis to keeps track of which instances are consuming which queues. To do this Verk has to store:

  • One key per instance with an expiration that every instance keeps reseting it from time to time. It’s the way that an instance can say: “I’m alive!!!”. Keys: verk:node:#{id}
  • A single Redis Set that include all running nodes. Key: verk_nodes;
  • One Redis Set per instance that include all queues for this instance. Keys: verk:node:#{id}:queues

Then every Verk instance keeps looking for instances that are inside the verk_nodes set and do not have its key verk:node:#{id} defined. If this happens we consider that the instance is dead as it’s incapable of keeping this key defined and it has expired. More info can be found here and here.

Redis Streams don’t provide this feature completely but it does 90% of the job! The next section will outline exactly what this means for each of these issues.

Verk 2.0

“The Stream is a new data type introduced with Redis 5.0, which models a log data structure.“

If you want to know exactly how Redis Streams work check out this blog post: https://redis.io/topics/streams-intro

Streams comes with excellent guarantees to represent a Verk Queue. Streams can be consumed by groups of consumers which automatically tracks which consumer was assigned which stream items.

Verk will be a Redis Stream group and each Verk instance will be a Redis Stream consumer of Redis Stream items (jobs).

Now I will explain each issue that I outlined in the previous section.

Verk Web can’t see all the jobs that are running for a queue

A Redis Stream keeps track of all pending items very efficiently. Pending items are items that were assigned to a specific consumer. We simply need to list all pending items!

Fair distribution of jobs across Verk instances is hard

When you have multiple consumers waiting for Redis Stream items to be processed, Redis will ensure that each consumer will receive items following a FIFO order: whichever consumer requested data first will receive first then the next etc.

Node Failure detection is inefficient

A Redis Stream keeps track of consumers that requested the items. We can use this fact to check if an instance is still using that item. Verk still has to store one key per instance with an expiration that every instance keeps reseting it from time to time. Keys: verk:node:#{id}

From time to time each instance will look for pending items from the Redis Stream and check if the instance is alive (if the key verk:node:#{id} is set). If the key is not set then we should claim those items to be processed again.

Next steps

There’s a branch with all the changes listed above: https://github.com/edgurgel/verk/compare/streams. It should be merged to master soon!

TL;DR Breaking Changes

  • Redis 5.0+
  • Stream instead of List for queues
  • Restoring jobs from a node that is not processing any jobs and left jobs pending will now enqueue them at the end of the queue instead of the beginning. That’s a very minor change but I listed here just in case.
Posted on June 9, 2019