PluckInBatches

ActiveRecord comes with find_each / find_in_batches / in_batches methods to batch process records from a database. ActiveRecord also has the pluck method which allows the selection of a set of fields without pulling the entire record into memory.

This gem combines these ideas and provides pluck_each and pluck_in_batches methods to allow batch processing of plucked fields from the database.

It performs half the number of SQL queries, allocates up to half the memory, and is up to 2x faster (or more, depending on the latency between your application and the database) than the available alternative:

# Before
User.in_batches do |batch| # or .find_in_batches, or .select(:email).find_each etc
  emails = batch.pluck(:email)
  # do something with emails
end

# Now, using this gem (up to 2x faster)
User.pluck_in_batches(:email) do |emails|
  # do something with emails
end
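
To illustrate where the "half of the SQL queries" figure comes from, here is a rough in-memory simulation. It is only a sketch and not the gem's implementation: `FakeTable`, `emails_via_id_batches` and `emails_via_single_query` are hypothetical names invented for this example, and each method call on `FakeTable` stands in for one SQL query. It assumes the "before" approach issues one query for the batch's ids and a second one for the plucked column, while the "now" approach fetches the cursor column and the wanted column together.

```ruby
# Toy in-memory "table" that counts how many queries are issued against it.
# (Hypothetical helper for illustration; no database is involved.)
class FakeTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows.sort_by { |row| row[:id] }
    @query_count = 0
  end

  # Simulates: SELECT columns FROM t WHERE id > after ORDER BY id LIMIT limit
  def select_after(after, limit, *columns)
    @query_count += 1
    @rows.select { |row| row[:id] > after }
         .first(limit)
         .map { |row| row.values_at(*columns) }
  end

  # Simulates: SELECT columns FROM t WHERE id IN (ids)
  def select_by_ids(ids, *columns)
    @query_count += 1
    @rows.select { |row| ids.include?(row[:id]) }
         .map { |row| row.values_at(*columns) }
  end
end

# Two queries per batch: fetch the ids, then pluck the column for those ids.
def emails_via_id_batches(table, batch_size)
  cursor = 0
  batches = []
  loop do
    ids = table.select_after(cursor, batch_size, :id).flatten
    break if ids.empty?

    batches << table.select_by_ids(ids, :email).flatten
    break if ids.size < batch_size

    cursor = ids.last
  end
  batches
end

# One query per batch: fetch the cursor column and the wanted column together.
def emails_via_single_query(table, batch_size)
  cursor = 0
  batches = []
  loop do
    rows = table.select_after(cursor, batch_size, :id, :email)
    break if rows.empty?

    batches << rows.map { |_id, email| email }
    break if rows.size < batch_size

    cursor = rows.last.first
  end
  batches
end

rows = (1..10).map { |i| { id: i, email: "user#{i}@example.com" } }
two_step = FakeTable.new(rows)
one_step = FakeTable.new(rows)

emails_via_id_batches(two_step, 4)
emails_via_single_query(one_step, 4)

puts two_step.query_count # prints 6
puts one_step.query_count # prints 3
```

Both strategies produce the same batches of emails; only the number of round trips differs, which is why the gain grows with database latency.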

Note: You may also find sidekiq-iteration useful when iterating over large collections in Sidekiq jobs.

Requirements

If you need support for older versions, open an issue.

Installation

Add this line to your application's Gemfile:

gem 'pluck_in_batches'

And then execute:

$ bundle

Or install it yourself as:

$ gem install pluck_in_batches

Usage

pluck_each

Behaves similarly to ActiveRecord's find_each method, but instead of records it yields the values of the specified columns, one row at a time.

# Single column
User.where(active: true).pluck_each(:email) do |email|
  # do something with email
end

# Multiple columns
User.where(active: true).pluck_each(:id, :email) do |id, email|
  # do something with id and email
end
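
The difference between the single-column and multi-column block signatures can be sketched in plain Ruby. This is an illustration only, assuming that values are produced in batches and then yielded one row at a time; `each_value` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: yields every element of every batch, one at a time.
def each_value(batches)
  batches.each do |batch|
    batch.each { |value| yield value }
  end
end

# Single column: each batch is a flat array of values.
single_column = [["a@example.com", "b@example.com"], ["c@example.com"]]

# Multiple columns: each batch is an array of [id, email] rows.
multi_column = [[[1, "a@example.com"], [2, "b@example.com"]], [[3, "c@example.com"]]]

each_value(single_column) { |email| puts email }

# An array yielded to a block with two parameters is destructured automatically.
each_value(multi_column) { |id, email| puts "#{id}: #{email}" }
```

With one column the block receives a scalar; with several columns it receives an array, which Ruby spreads across the block parameters.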

pluck_in_batches

Behaves similarly to ActiveRecord's in_batches method, but instead of relations it yields arrays of values for the specified columns, one batch at a time.

# Single column
User.where("age > 21").pluck_in_batches(:email) do |emails|
  jobs = emails.map { |email| PartyReminderJob.new(email) }
  ActiveJob.perform_all_later(jobs)
end

# Multiple columns
User.pluck_in_batches(:name, :email).with_index do |group, index|
  puts "Processing group ##{index}"
  jobs = group.map { |name, email| PartyReminderJob.new(name, email) }
  ActiveJob.perform_all_later(jobs)
end

# Custom arel column
User.pluck_in_batches(:id, Arel.sql("json_extract(users.metadata, '$.rank')")).with_index do |group, index|
  # ...
end
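
The `.with_index` calls above work because calling the method without a block returns an enumerator. A minimal plain-Ruby sketch of that pattern follows; `BatchSource` is a hypothetical class that slices an in-memory array, and it is not how the gem reads from the database.

```ruby
# Hypothetical stand-in for a batching API, backed by an in-memory array.
class BatchSource
  def initialize(values, batch_size)
    @values = values
    @batch_size = batch_size
  end

  # Yields batches when a block is given, otherwise returns an Enumerator,
  # so that Enumerator methods such as with_index can be chained.
  def in_batches
    return enum_for(:in_batches) unless block_given?

    @values.each_slice(@batch_size) { |batch| yield batch }
  end
end

source = BatchSource.new(%w[a b c d e], 2)

source.in_batches.with_index do |group, index|
  puts "Processing group ##{index}: #{group.inspect}"
end
```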

Both methods support the following configuration options:

Development

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and the created tag, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/fatkodima/pluck_in_batches.

License

The gem is available as open source under the terms of the MIT License.