Email Address

The email_address gem provides a Ruby language library for working with email addresses.

By default, it validates against conventional usage, the format preferred for user email addresses. It can be configured to validate against RFC "Standard" formats, common email service provider formats, and perform DNS validation.

Using email_address to validate user email addresses results in fewer "false positives" due to typing errors and gibberish data. It validates syntax more strictly for popular email providers, and can deal with gmail's "optional dots" in addresses.

It provides Active Record (Rails) extensions, including an address validator and attributes API custom datatypes.

Requires Ruby 2.0 or later.

Looking for a JavaScript version of this library? Check out the email_address npm module.

Quick Start

To quickly validate email addresses, use the valid? and error helpers. valid? returns a boolean, and error returns nil if valid, otherwise a basic error message.

EmailAddress.valid? "allen@google.com" #=> true
EmailAddress.error "allen@bad-d0main.com" #=> "Invalid Host/Domain Name"

EmailAddress deeply validates your email addresses. It checks:

It does not check:

By default, MX records are required in DNS. MX or "mail exchanger" records tell where to deliver email for the domain. Many domains run their website on one provider (ISP, Heroku, etc.), and email on a different provider (such as G Suite). Note that example.com, while a valid domain name, does not have MX records.

EmailAddress.valid? "allen@example.com" #=> false
EmailAddress.valid? "allen@example.com", host_validation: :syntax #=> true
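
The MX requirement can be approximated with Ruby's standard resolv library. This is a minimal sketch, not the gem's implementation: mx_hosts and mx? are hypothetical helper names, and the resolver: keyword is an assumption added so that a stub can stand in for live DNS.

```ruby
require "resolv"

# Hypothetical helper: returns the domain's MX host names,
# ordered by ascending preference value (most preferred first).
def mx_hosts(domain, resolver: Resolv::DNS.new)
  resolver.getresources(domain, Resolv::DNS::Resource::IN::MX)
          .sort_by(&:preference)
          .map { |mx| mx.exchange.to_s }
end

# Hypothetical helper: true when the domain publishes at least one MX record.
def mx?(domain, resolver: Resolv::DNS.new)
  !mx_hosts(domain, resolver: resolver).empty?
end
```

Calling mx?("gmail.com") performs a real DNS query, so the result depends on the network and on the domain's current records.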

Most mail servers do not yet support Unicode mailboxes, so the default here is ASCII.

EmailAddress.error "Pelé@google.com" #=> "Invalid Recipient/Mailbox"
EmailAddress.valid? "Pelé@google.com", local_encoding: :unicode #=> true
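
As a rough illustration of the ASCII default, the check below tests only whether the local part contains non-ASCII characters. It is a sketch under that single assumption; ascii_local? is a hypothetical name and the gem applies further syntax rules beyond this.

```ruby
# Hypothetical helper: true when the local part (everything before
# the last "@") consists solely of ASCII characters.
def ascii_local?(address)
  local, _at, _host = address.rpartition("@")
  local.ascii_only?
end

ascii_local?("allen@google.com") #=> true
ascii_local?("Pelé@google.com")  #=> false
```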

Background

The email address specification is complex and often not what you want when working with personal email addresses in applications. This library introduces terms to distinguish types of email addresses.

Other terms:

Wikipedia has a great article on Email Addresses, much more readable than the section within RFC 5322

Avoiding the Bad Parts of RFC Specification

Following the RFC specification sounds like a good idea, until you learn about all the madness contained therein. This library can validate the RFC syntax, but this is never useful, especially when validating user email address submissions. By default, it validates to the conventional format.

Here are a few parts of the RFC specification you should avoid:

Installation With Rails or Bundler

If you are using Rails or a project with Bundler, add this line to your application's Gemfile:

gem 'email_address'

And then execute:

$ bundle

Installation Without Bundler

If you are not using Bundler, you need to install the gem yourself.

$ gem install email_address

Require the gem inside your script.

require 'rubygems'
require 'email_address'

Usage

Use EmailAddress to do transformations and validations. You can also instantiate an object to inspect the address.

These top-level helpers return edited email addresses and validation checks.

address = "Clark.Kent+scoops@gmail.com"
EmailAddress.valid?(address)    #=> true
EmailAddress.normal(address)    #=> "clark.kent+scoops@gmail.com"
EmailAddress.canonical(address) #=> "clarkkent@gmail.com"
EmailAddress.reference(address) #=> "c5be3597c391169a5ad2870f9ca51901"
EmailAddress.redact(address)    #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"
EmailAddress.munge(address)     #=> "cl*****@gm*****"
EmailAddress.matches?(address, 'google') #=> 'google' (true)
EmailAddress.error("#bad@example.com") #=> "Invalid Mailbox"

Or you can create an instance of the email address to work with it.

email = EmailAddress.new(address) #=> #<EmailAddress::Address:0x007fe6ee150540 ...>
email.normal        #=> "clark.kent+scoops@gmail.com"
email.canonical     #=> "clarkkent@gmail.com"
email.original      #=> "Clark.Kent+scoops@gmail.com"
email.valid?        #=> true

Here are some other methods that are available.

email.redact        #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"
email.sha1          #=> "bea3f3560a757f8142d38d212a931237b218eb5e"
email.sha256        #=> "9e2a0270f2d6778e5f647fc9eaf6992705ca183c23d1ed1166586fd54e859f75"
email.md5           #=> "c5be3597c391169a5ad2870f9ca51901"
email.host_name     #=> "gmail.com"
email.provider      #=> :google
email.mailbox       #=> "clark.kent"
email.tag           #=> "scoops"

email.host.exchangers.first[:ip] #=> "2a00:1450:400b:c02::1a"
email.host.txt_hash #=> {:v=>"spf1", :redirect=>"_spf.google.com"}

EmailAddress.normal("HIRO@こんにちは世界.com")
                    #=> "hiro@xn--28j2a3ar1pp75ovm7c.com"
EmailAddress.normal("hiro@xn--28j2a3ar1pp75ovm7c.com", host_encoding: :unicode)
                    #=> "hiro@こんにちは世界.com"

As of release 0.1.17, exchanger_match is no longer used for host provider determination, which designated the set of rules for that domain. Sometimes, as in Google-hosted domains, the address rules are different, notably the optional dots in mailboxes for gmail.com accounts do not apply to other private domains hosted at google.

To access the provider service, you can now call:

EmailAddress.new("user@hosteddomain.com").host.hosted_provider

Rails Validator

For Rails' ActiveRecord classes, EmailAddress provides an ActiveRecordValidator. Specify your email address attributes with field: :user_email, or fields: [:email1, :email2]. If neither is given, it uses the email or email_address attribute.

class User < ActiveRecord::Base
  validates_with EmailAddress::ActiveRecordValidator, field: :email
end

Rails I18n

Copy and adapt lib/email_address/messages.yaml into your locales and create an after initialization callback:

# config/initializers/email_address.rb

Rails.application.config.after_initialize do
  I18n.available_locales.each do |locale|
    translations = I18n.t(:email_address, locale: locale)

    next unless translations.is_a? Hash

    EmailAddress::Config.error_messages translations.transform_keys(&:to_s), locale.to_s
  end
end

Rails Email Address Type Attribute

Initial support is provided for Active Record 5.0 attributes API.

First, you need to register the type in config/initializers/email_address.rb along with any global configurations you want.

ActiveRecord::Type.register(:email_address, EmailAddress::EmailAddressType)
ActiveRecord::Type.register(:canonical_email_address,
                            EmailAddress::CanonicalEmailAddressType)

Assume the Users table contains the columns "email" and "canonical_email". We want to normalize the address in "email" and store the canonical/unique version in "canonical_email". This code will set the canonical_email when the email attribute is assigned. With the canonical_email column, we can look up the User, even if the given email address didn't exactly match the registered version.

class User < ApplicationRecord
  attribute :email, :email_address
  attribute :canonical_email, :canonical_email_address

  validates_with EmailAddress::ActiveRecordValidator,
                 fields: %i(email canonical_email)

  def email=(email_address)
    self[:canonical_email] = email_address
    self[:email] = email_address
  end

  def self.find_by_email(email)
    user   = self.find_by(email: EmailAddress.normal(email))
    user ||= self.find_by(canonical_email: EmailAddress.canonical(email))
    user ||= self.find_by(canonical_email: EmailAddress.redact(email))
    user
  end

  def redact!
    self[:canonical_email] = EmailAddress.redact(self.canonical_email)
    self[:email]           = self[:canonical_email]
  end
end

Here is how the User model works:

user = User.create(email:"Pat.Smith+registrations@gmail.com")
user.email           #=> "pat.smith+registrations@gmail.com"
user.canonical_email #=> "patsmith@gmail.com"
User.find_by_email("PAT.SMITH@GMAIL.COM")
                     #=> #<User email="pat.smith+registrations@gmail.com">

The find_by_email method looks up a given email address by the normalized form (lower case), then by the canonical form, then finally by the redacted form.

Validation

The only true validation is to send a message to the email address and have the user (or process) verify it has been received. Syntax checks help prevent erroneous input. Even sent messages can be silently dropped, or bounced back after acceptance. Conditions such as a "Mailbox Full" can mean the email address is known, but abandoned.

There are different levels of validations you can perform. By default, it will validate to the "Provider" (if known), or "Conventional" format defined as the "default" provider. You may pass a list of parameters to select which syntax and network validations to perform.

Comparison

You can compare email addresses:

e1 = EmailAddress.new("Clark.Kent@Gmail.com")
e2 = EmailAddress.new("clark.kent+Superman@Gmail.com")
e3 = EmailAddress.new(e2.redact)
e1.to_s           #=> "clark.kent@gmail.com"
e2.to_s           #=> "clark.kent+superman@gmail.com"
e3.to_s           #=> "{bea3f3560a757f8142d38d212a931237b218eb5e}@gmail.com"

e1 == e2          #=> false (Matches by normalized address)
e1.same_as?(e2)   #=> true  (Matches as canonical address)
e1.same_as?(e3)   #=> true  (Matches as redacted address)
e1 < e2           #=> true  (Compares using normalized address)

Matching

Matching addresses by simple patterns:

Usage:

e = EmailAddress.new("Clark.Kent@Gmail.com")
e.matches?("gmail.com") #=> true
e.matches?("google")    #=> true
e.matches?(".org")      #=> false
e.matches?("g*com")     #=> true
e.matches?("gmail.")    #=> true
e.matches?("*kent*@")   #=> true
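
The pattern styles in the usage example can be sketched with File.fnmatch? from the standard library. This is an illustrative approximation and not the gem's matcher: simple_match? and glob? are hypothetical names, the rule order is an assumption, and provider-name patterns such as "google" are not covered because they need a provider table.

```ruby
# Glob comparison; FNM_DOTMATCH lets "*" match a leading dot as well.
def glob?(pattern, text)
  File.fnmatch?(pattern, text, File::FNM_DOTMATCH)
end

# Hypothetical matcher covering the domain, TLD, host-prefix and
# "local@host" glob styles. Comparison is case-insensitive.
def simple_match?(address, pattern)
  local, _at, host = address.downcase.rpartition("@")
  pattern = pattern.downcase

  if pattern.include?("@")
    # "local@host" form: an empty side matches anything.
    local_pat, _sep, host_pat = pattern.rpartition("@")
    local_pat = "*" if local_pat.empty?
    host_pat  = "*" if host_pat.empty?
    glob?(local_pat, local) && glob?(host_pat, host)
  elsif pattern.start_with?(".")
    host.end_with?(pattern)    # ".org" style: top-level domain suffix
  elsif pattern.end_with?(".")
    host.start_with?(pattern)  # "gmail." style: host name prefix
  else
    glob?(pattern, host)       # "gmail.com" or "g*com": whole host
  end
end

simple_match?("Clark.Kent@Gmail.com", "g*com")   #=> true
simple_match?("Clark.Kent@Gmail.com", ".org")    #=> false
simple_match?("Clark.Kent@Gmail.com", "*kent*@") #=> true
```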

Configuration

You can pass an options hash on the .new() and helper class methods to control how the library treats that address. These can also be configured during initialization by provider and default (see below).

EmailAddress.new("clark.kent@gmail.com",
                 host_validation: :syntax, host_encoding: :unicode)

Globally, you can change and query configuration options:

EmailAddress::Config.setting(:host_validation, :mx)
EmailAddress::Config.setting(:host_validation) #=> :mx

Or set multiple settings at once:

EmailAddress::Config.configure(local_downcase: false, host_validation: :syntax)

You can add special rules by domain or provider. It takes the options above and adds the :domain_match and :exchanger_match rules.

EmailAddress.define_provider('google',
  domain_match:      %w(gmail.com googlemail.com),
  exchanger_match:   %w(google.com), # Requires host_validation==:mx
  local_size:        5..64,
  mailbox_canonical: ->(m) {m.gsub('.','')})
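
To show how a domain_match list and a mailbox_canonical lambda could work together, here is a standalone sketch that does not use the gem. PROVIDER_RULES, DEFAULT_RULES, rules_for and canonical are hypothetical names, and treating everything after the first "+" as the tag is an assumption.

```ruby
# Hypothetical rule table modeled on the define_provider options above.
PROVIDER_RULES = {
  google: {
    domain_match:      %w[gmail.com googlemail.com],
    mailbox_canonical: ->(m) { m.delete(".") } # dots are optional here
  }
}.freeze

# Fallback rules: the mailbox is left unchanged.
DEFAULT_RULES = { mailbox_canonical: ->(m) { m } }.freeze

# Picks the first provider whose domain list contains the host.
def rules_for(host)
  _name, rules = PROVIDER_RULES.find { |_, r| r[:domain_match].include?(host) }
  rules || DEFAULT_RULES
end

# Downcases, drops the "+tag", then applies the provider's mailbox rule.
def canonical(address)
  local, _at, host = address.downcase.rpartition("@")
  mailbox = local.partition("+").first
  "#{rules_for(host)[:mailbox_canonical].call(mailbox)}@#{host}"
end

canonical("Clark.Kent+scoops@gmail.com")   #=> "clarkkent@gmail.com"
canonical("Clark.Kent+scoops@example.org") #=> "clark.kent@example.org"
```

Keeping the canonicalization step as a lambda in the rule table means each provider can supply its own transformation without changing the lookup code.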

The library ships with the most common set of provider rules. It is not meant to house a database of all providers, but a separate email_address-providers gem may be created to hold this data for those who need more complete rules.

Personal and Corporate email systems are not intended for either solution. Any of these email systems may be configured locally.

Pre-configured email address providers include: Google (gmail), AOL, MSN (hotmail, live, outlook), and Yahoo. Any address not matching one of those patterns uses the "default" provider rule set. Exchanger matches are made against the Mail Exchanger (SMTP receiver) hosts defined in DNS. If you specify an exchanger pattern, it requires a DNS MX lookup.

For a Rails application, create an initializer file with your default configuration options. EmailAddress::Config.setting takes a single setting name and value, while EmailAddress::Config.configure takes a hash of multiple settings.

# ./config/initializers/email_address.rb
EmailAddress::Config.setting( :local_format, :relaxed )
EmailAddress::Config.configure( local_format: :relaxed, ... )
EmailAddress::Config.provider(:github,
       host_match: %w(github.com), local_format: :standard)

Override Error Messages

You can override the default error messages as follows:

EmailAddress::Config.error_messages({
  invalid_address:    "Invalid Email Address",
  invalid_mailbox:    "Invalid Recipient/Mailbox",
  invalid_host:       "Invalid Host/Domain Name",
  exceeds_size:       "Address too long",
  not_allowed:        "Address is not allowed",
  incomplete_domain:  "Domain name is incomplete"}, 'en')

Note: Release 0.1.14 fixed setting error messages by locale. Also, it will accept a Ruby "collected" hash as before.

Full translation support would be ideal though.

Available Configuration Settings

For local part configuration:

For the mailbox (AKA account, role), without the tag

For provider rules to match to domain names and Exchanger hosts. The value is an array of match tokens.

Namespace conflict resolution

If your application already uses the EmailAddress class name, it's possible to create an alias prior to loading your code:

For a Rails application, you can do this in config/application.rb after the Bundler.require line, usually:

Bundler.require(*Rails.groups)

Add these lines immediately after that point:

EmailAddressValidator = EmailAddress
Object.send(:remove_const, :EmailAddress)

Then your application loads with your EmailAddress class. You may then use this gem with EmailAddressValidator or whatever name you gave it above:

EmailAddressValidator.valid?("clark.kent@gmail.com") # => true

Notes

Internationalization

The industry is moving to support Unicode characters in the local part of the email address. Currently, SMTP supports only 7-bit ASCII. A new SMTPUTF8 standard is available, but it is not yet widely implemented. To work properly, global Email systems must be converted to UTF-8 encoded databases and upgraded to the new email standards.

The problem with i18n email addresses is that, outside of the given locale, they become hard to enter on keyboards made for another locale. Because of this, internationalized local parts are not yet supported by default. They are more likely to be erroneous.

Proper personal identity can still be provided using MIME Encoded-Words in Email headers.

Email Addresses as Sensitive Data

Like Social Security and Credit Card Numbers, email addresses are becoming more important as a personal identifier on the internet. Increasingly, we should treat email addresses as sensitive data. If your site/database becomes compromised by hackers, these email addresses can be stolen and used to spam your users and to try to gain access to their accounts. You should not be storing passwords in plain text; perhaps you don't need to store email addresses un-encoded either.

Consider this: upon registration, store the redacted email address for the user, and of course, the salted, encrypted password. When the user logs in, compute the redacted email address from the user-supplied one and look up the record. Store the original address in the session for the user, which goes away when the user logs out.

Sometimes, users demand you strike their information from the database. Instead of deleting their account, you can "redact" their email address, retaining the state of the account to prevent future access. Given the original email address again, the redacted account can be identified if necessary.

Because of these use cases, the redact method on the email address instance has been provided.
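
The store-and-look-up idea above can be sketched with the standard digest library. This is not the gem's redact method: the redact helper below is a hypothetical stand-in that assumes the SHA1 digest is taken over the trimmed, downcased address, and the users hash stands in for a database table.

```ruby
require "digest/sha1"

# Hypothetical helper: builds "{sha1-hex}@host" from an address.
# Assumption: the digest covers the trimmed, downcased address.
def redact(address)
  normalized = address.strip.downcase
  host = normalized.rpartition("@").last
  "{#{Digest::SHA1.hexdigest(normalized)}}@#{host}"
end

# Stand-in for a users table keyed by the redacted address.
users = {}
users[redact("Pat.Smith@example.org")] = { plan: "free" }

# At login, recompute the redacted form from the supplied address
# and use it as the lookup key; the plain address is never stored.
users.key?(redact("pat.smith@EXAMPLE.org")) #=> true
```

An unsalted digest of a guessable address can be reversed by brute force, so a real deployment would likely want a secret mixed into the digest.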

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

Project

This project lives at https://github.com/afair/email_address/

Authors