Home

Awesome

Solid Queue

Solid Queue is a DB-based queuing backend for Active Job, designed with simplicity and performance in mind.

Besides regular job enqueuing and processing, Solid Queue supports delayed jobs, concurrency controls, pausing queues, numeric priorities per job, priorities by queue order, and bulk enqueuing (enqueue_all for Active Job's perform_all_later).

Solid Queue can be used with SQL databases such as MySQL, PostgreSQL or SQLite, and it leverages the FOR UPDATE SKIP LOCKED clause, if available, to avoid blocking and waiting on locks when polling jobs. It relies on Active Job for retries, discarding, error handling, serialization, or delays, and it's compatible with Ruby on Rails multi-threading.

Installation

Solid Queue is configured by default in new Rails 8 applications. But if you're running an earlier version, you can add it manually following these steps:

  1. bundle add solid_queue
  2. bin/rails solid_queue:install

This will configure Solid Queue as the production Active Job backend, create config/solid_queue.yml, and create the db/queue_schema.rb.

You will then have to add the configuration for the queue database in config/database.yml. If you're using sqlite, it'll look like this:

production:
  primary:
    <<: *default
    database: storage/production.sqlite3
  queue:
    <<: *default
    database: storage/production_queue.sqlite3
    migrations_paths: db/queue_migrate

...or if you're using MySQL/PostgreSQL/Trilogy:

production:
  primary: &primary_production
    <<: *default
    database: app_production
    username: app
    password: <%= ENV["APP_DATABASE_PASSWORD"] %>
  queue:
    <<: *primary_production
    database: app_production_queue
    migrations_paths: db/queue_migrate

Note: Calling bin/rails solid_queue:install will automatically add config.solid_queue.connects_to = { database: { writing: :queue } } to config/environments/production.rb, so no additional configuration is needed there (although you must make sure that you use the queue name in database.yml for this to match!). But if you want to use Solid Queue in a different environment (like staging or even development), you'll have to manually add that config.solid_queue.connects_to line to the respective environment file. And, as always, make sure that the name you're using for the database in config/database.yml matches the name you use in config.solid_queue.connects_to.

Then run db:prepare in production to ensure the database is created and the schema is loaded.

Now you're ready to start processing jobs by running bin/jobs on the server that's doing the work. This will start processing jobs in all queues using the default configuration. See below to learn more about configuring Solid Queue.

For small projects, you can run Solid Queue on the same machine as your webserver. When you're ready to scale, Solid Queue supports horizontal scaling out-of-the-box. You can run Solid Queue on a separate server from your webserver, or even run bin/jobs on multiple machines at the same time. Depending on the configuration, you can designate some machines to run only dispatchers or only workers. See the configuration section for more details on this.

Single database configuration

It's also possibile to use one single database for both production data:

  1. Shovel db/queue_schema.rb into a normal migration and delete db/queue_schema.rb
  2. Remove config.solid_queue.connects_to from production.rb
  3. Migrate your database. You are ready to run bin/jobs

You won't have multiple databases, so database.yml doesn't need to have primary and queue database.

Incremental adoption

If you're planning to adopt Solid Queue incrementally by switching one job at the time, you can do so by leaving the config.active_job.queue_adapter set to your old backend, and then set the queue_adapter directly in the jobs you're moving:

# app/jobs/my_job.rb

class MyJob < ApplicationJob
  self.queue_adapter = :solid_queue
  # ...
end

High performance requirements

Solid Queue was designed for the highest throughput when used with MySQL 8+ or PostgreSQL 9.5+, as they support FOR UPDATE SKIP LOCKED. You can use it with older versions, but in that case, you might run into lock waits if you run multiple workers for the same queue. You can also use it with SQLite on smaller applications.

Configuration

Workers and dispatchers

We have three types of actors in Solid Queue:

Solid Queue's supervisor will fork a separate process for each supervised worker/dispatcher.

By default, Solid Queue will try to find your configuration under config/solid_queue.yml, but you can set a different path using the environment variable SOLID_QUEUE_CONFIG. This is what this configuration looks like:

production:
  dispatchers:
    - polling_interval: 1
      batch_size: 500
      concurrency_maintenance_interval: 300
  workers:
    - queues: "*"
      threads: 3
      polling_interval: 2
    - queues: [ real_time, background ]
      threads: 5
      polling_interval: 0.1
      processes: 3

Everything is optional. If no configuration at all is provided, Solid Queue will run with one dispatcher and one worker with default settings. If you want to run only dispatchers or workers, you just need to include that section alone in the configuration. For example, with the following configuration:

production:
  dispatchers:
    - polling_interval: 1
      batch_size: 500
      concurrency_maintenance_interval: 300

the supervisor will run 1 dispatcher and no workers.

Here's an overview of the different options:

Queue order and priorities

As mentioned above, if you specify a list of queues for a worker, these will be polled in the order given, such as for the list real_time,background, no jobs will be taken from background unless there aren't any more jobs waiting in real_time.

Active Job also supports positive integer priorities when enqueuing jobs. In Solid Queue, the smaller the value, the higher the priority. The default is 0.

This is useful when you run jobs with different importance or urgency in the same queue. Within the same queue, jobs will be picked in order of priority, but in a list of queues, the queue order takes precedence, so in the previous example with real_time,background, jobs in the real_time queue will be picked before jobs in the background queue, even if those in the background queue have a higher priority (smaller value) set.

We recommend not mixing queue order with priorities but either choosing one or the other, as that will make job execution order more straightforward for you.

Threads, processes and signals

Workers in Solid Queue use a thread pool to run work in multiple threads, configurable via the threads parameter above. Besides this, parallelism can be achieved via multiple processes on one machine (configurable via different workers or the processes parameter above) or by horizontal scaling.

The supervisor is in charge of managing these processes, and it responds to the following signals:

When receiving a QUIT signal, if workers still have jobs in-flight, these will be returned to the queue when the processes are deregistered.

If processes have no chance of cleaning up before exiting (e.g. if someone pulls a cable somewhere), in-flight jobs might remain claimed by the processes executing them. Processes send heartbeats, and the supervisor checks and prunes processes with expired heartbeats, which will release any claimed jobs back to their queues. You can configure both the frequency of heartbeats and the threshold to consider a process dead. See the section below for this.

Database configuration

You can configure the database used by Solid Queue via the config.solid_queue.connects_to option in the config/application.rb or config/environments/production.rb config files. By default, a single database is used for both writing and reading called queue to match the database configuration you set up during the install.

All the options available to Active Record for multiple databases can be used here.

Lifecycle hooks

In Solid queue, you can hook into two different points in the supervisor's life:

And into two different points in a worker's life:

You can use the following methods with a block to do this:

SolidQueue.on_start
SolidQueue.on_stop

SolidQueue.on_worker_start
SolidQueue.on_worker_stop

For example:

SolidQueue.on_start { start_metrics_server }
SolidQueue.on_stop { stop_metrics_server }

These can be called several times to add multiple hooks, but it needs to happen before Solid Queue is started. An initializer would be a good place to do this.

Other configuration settings

Note: The settings in this section should be set in your config/application.rb or your environment config like this: config.solid_queue.silence_polling = true

There are several settings that control how Solid Queue works that you can set as well:

Errors when enqueuing

Solid Queue will raise a SolidQueue::Job::EnqueueError for any Active Record errors that happen when enqueuing a job. The reason for not raising ActiveJob::EnqueueError is that this one gets handled by Active Job, causing perform_later to return false and set job.enqueue_error, yielding the job to a block that you need to pass to perform_later. This works very well for your own jobs, but makes failure very hard to handle for jobs enqueued by Rails or other gems, such as Turbo::Streams::BroadcastJob or ActiveStorage::AnalyzeJob, because you don't control the call to perform_later in that cases.

In the case of recurring tasks, if such error is raised when enqueuing the job corresponding to the task, it'll be handled and logged but it won't bubble up.

Concurrency controls

Solid Queue extends Active Job with concurrency controls, that allows you to limit how many jobs of a certain type or with certain arguments can run at the same time. When limited in this way, jobs will be blocked from running, and they'll stay blocked until another job finishes and unblocks them, or after the set expiry time (concurrency limit's duration) elapses. Jobs are never discarded or lost, only blocked.

class MyJob < ApplicationJob
  limits_concurrency to: max_concurrent_executions, key: ->(arg1, arg2, **) { ... }, duration: max_interval_to_guarantee_concurrency_limit, group: concurrency_group

  # ...

When a job includes these controls, we'll ensure that, at most, the number of jobs (indicated as to) that yield the same key will be performed concurrently, and this guarantee will last for duration for each job enqueued. Note that there's no guarantee about the order of execution, only about jobs being performed at the same time (overlapping).

For example:

class DeliverAnnouncementToContactJob < ApplicationJob
  limits_concurrency to: 2, key: ->(contact) { contact.account }, duration: 5.minutes

  def perform(contact)
    # ...

Where contact and account are ActiveRecord records. In this case, we'll ensure that at most two jobs of the kind DeliverAnnouncementToContact for the same account will run concurrently. If, for any reason, one of those jobs takes longer than 5 minutes or doesn't release its concurrency lock within 5 minutes of acquiring it, a new job with the same key might gain the lock.

Let's see another example using group:

class Box::MovePostingsByContactToDesignatedBoxJob < ApplicationJob
  limits_concurrency key: ->(contact) { contact }, duration: 15.minutes, group: "ContactActions"

  def perform(contact)
    # ...
class Bundle::RebundlePostingsJob < ApplicationJob
  limits_concurrency key: ->(bundle) { bundle.contact }, duration: 15.minutes, group: "ContactActions"

  def perform(bundle)
    # ...

In this case, if we have a Box::MovePostingsByContactToDesignatedBoxJob job enqueued for a contact record with id 123 and another Bundle::RebundlePostingsJob job enqueued simultaneously for a bundle record that references contact 123, only one of them will be allowed to proceed. The other one will stay blocked until the first one finishes (or 15 minutes pass, whatever happens first).

Note that the duration setting depends indirectly on the value for concurrency_maintenance_interval that you set for your dispatcher(s), as that'd be the frequency with which blocked jobs are checked and unblocked. In general, you should set duration in a way that all your jobs would finish well under that duration and think of the concurrency maintenance task as a failsafe in case something goes wrong.

Jobs are unblocked in order of priority but queue order is not taken into account for unblocking jobs. That means that if you have a group of jobs that share a concurrency group but are in different queues, or jobs of the same class that you enqueue in different queues, the queue order you set for a worker is not taken into account when unblocking blocked ones. The reason is that a job that runs unblocks the next one, and the job itself doesn't know about a particular worker's queue order (you could even have different workers with different queue orders), it can only know about priority. Once blocked jobs are unblocked and available for polling, they'll be picked up by a worker following its queue order.

Finally, failed jobs that are automatically or manually retried work in the same way as new jobs that get enqueued: they get in the queue for gaining the lock, and whenever they get it, they'll be run. It doesn't matter if they had gained the lock already in the past.

Failed jobs and retries

Solid Queue doesn't include any automatic retry mechanism, it relies on Active Job for this. Jobs that fail will be kept in the system, and a failed execution (a record in the solid_queue_failed_executions table) will be created for these. The job will stay there until manually discarded or re-enqueued. You can do this in a console as:

failed_execution = SolidQueue::FailedExecution.find(...) # Find the failed execution related to your job
failed_execution.error # inspect the error

failed_execution.retry # This will re-enqueue the job as if it was enqueued for the first time
failed_execution.discard # This will delete the job from the system

However, we recommend taking a look at mission_control-jobs, a dashboard where, among other things, you can examine and retry/discard failed jobs.

Puma plugin

We provide a Puma plugin if you want to run the Solid Queue's supervisor together with Puma and have Puma monitor and manage it. You just need to add

plugin :solid_queue

to your puma.rb configuration.

Recurring tasks

Solid Queue supports defining recurring tasks that run at specific times in the future, on a regular basis like cron jobs. These are managed by dispatcher processes and as such, they can be defined in the dispatcher's configuration like this:

  dispatchers:
    - polling_interval: 1
      batch_size: 500
      recurring_tasks:
        my_periodic_job:
          class: MyJob
          args: [ 42, { status: "custom_status" } ]
          schedule: every second

recurring_tasks is a hash/dictionary, and the key will be the task key internally. Each task needs to have a class, which will be the job class to enqueue, and a schedule. The schedule is parsed using Fugit, so it accepts anything that Fugit accepts as a cron. You can also provide arguments to be passed to the job, as a single argument, a hash, or an array of arguments that can also include kwargs as the last element in the array.

The job in the example configuration above will be enqueued every second as:

MyJob.perform_later(42, status: "custom_status")

Tasks are enqueued at their corresponding times by the dispatcher that owns them, and each task schedules the next one. This is pretty much inspired by what GoodJob does.

It's possible to run multiple dispatchers with the same recurring_tasks configuration. To avoid enqueuing duplicate tasks at the same time, an entry in a new solid_queue_recurring_executions table is created in the same transaction as the job is enqueued. This table has a unique index on task_key and run_at, ensuring only one entry per task per time will be created. This only works if you have preserve_finished_jobs set to true (the default), and the guarantee applies as long as you keep the jobs around.

Finally, it's possible to configure jobs that aren't handled by Solid Queue. That is, you can have a job like this in your app:

class MyResqueJob < ApplicationJob
  self.queue_adapter = :resque

  def perform(arg)
    # ..
  end
end

You can still configure this in Solid Queue:

  dispatchers:
    - recurring_tasks:
        my_periodic_resque_job:
          class: MyResqueJob
          args: 22
          schedule: "*/5 * * * *"

and the job will be enqueued via perform_later so it'll run in Resque. However, in this case we won't track any solid_queue_recurring_execution record for it and there won't be any guarantees that the job is enqueued only once each time.

Inspiration

Solid Queue has been inspired by resque and GoodJob. We recommend checking out these projects as they're great examples from which we've learnt a lot.

License

The gem is available as open source under the terms of the MIT License.