Multibases

Multibase is a protocol for disambiguating the encoding of base-encoded (e.g., base32, base64, base58, etc.) binary appearing in text.

Multibases is the Ruby implementation of multiformats/multibase.

This gem can encode into and decode from multibase-packed strings. It can also serve as a general-purpose library for BaseX encoding and decoding without adding the multibase prefix.

🙌🏽 This gem is called multibases rather than the singular form to stay consistent with the multihashes gem, which had to take a different name because multihash was already taken. The same is true of multibase and others. In the future, this gem might be renamed to multiformats-base, with a backwards-compatible interface.

Installation

Add this line to your application's Gemfile:

gem 'multibases'

Alternatively, if you would like to bring your own engines and not load any of the built-in ones:

gem 'multibases', require: 'multibases/bare'

And then execute:

$ bundle

Or install it yourself as:

$ gem install multibases

Usage

This is a low-level library, but high-level implementations are provided. You can also bring your own encoder/decoder. The most important methods are:

encoded = Multibases.encode('base2', 'mb')
# => #<struct Multibases::Encoded
#             code="0", encoding="base2", length=16,
#             data=[Multibases::EncodedByteArray "0110110101100010"]>

encoded.pack
# => [Multibases::EncodedByteArray "00110110101100010"]


encoded = Multibases.unpack('766542')
# => #<struct Multibases::Encoded
#             code="7", encoding="base8", length=5,
#             data=[Multibases::EncodedByteArray "66542"]>

encoded.decode
# => [Multibases::DecodedByteArray "mb"]

This means that the flow of calls is as follows:

        data ➡️ (encode) ➡️ encoded data
encoded data ➡️ (pack)   ➡️ multibasestr

multibasestr ➡️ (unpack) ➡️ encoded data
encoded data ➡️ (decode) ➡️ data

Convenience methods are provided:

Multibases.pack('base2', 'mb')
# => [Multibases::EncodedByteArray "00110110101100010"]
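To make the four steps concrete, here is a minimal plain-Ruby sketch of the same flow for base2 only. It does not use the gem: `Base2Sketch` and its methods are hypothetical names chosen for illustration, and they return plain strings rather than the gem's ByteArray wrappers.

```ruby
# Hypothetical helper, not part of the multibases gem: a base2-only sketch of
# the encode -> pack -> unpack -> decode flow.
module Base2Sketch
  CODE = '0' # multibase prefix for base2, as seen in the struct output above

  # data -> encoded data (one '0'/'1' character per bit, most significant first)
  def self.encode(data)
    data.unpack1('B*')
  end

  # encoded data -> multibase string (prefix + encoded data)
  def self.pack(encoded)
    CODE + encoded
  end

  # multibase string -> encoded data (prefix checked and stripped)
  def self.unpack(multibase_str)
    code = multibase_str[0]
    raise ArgumentError, "unexpected code #{code.inspect}" unless code == CODE

    multibase_str[1..-1]
  end

  # encoded data -> data (returned as a binary string)
  def self.decode(encoded)
    [encoded].pack('B*')
  end
end

encoded = Base2Sketch.encode('mb')                        # "0110110101100010"
packed  = Base2Sketch.pack(encoded)                       # "00110110101100010"
decoded = Base2Sketch.decode(Base2Sketch.unpack(packed))  # "mb"
```

The prefix is the only thing pack adds and the only thing unpack removes, which is why the built-in engines can also be used without it.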

ByteArrays and encoding

As you can see, the "final" methods output a ByteArray. These are simple DelegateClass wrappers around an array of bytes, which means that the hex encoding of hello is not stored as the string "f68656c6c6f" but as the byte values of those characters:

packed = Multibases.pack('base16', 'hello')
# => [Multibases::EncodedByteArray "f68656c6c6f"]

packed.to_a # .__getobj__.dup
# => [102, 54, 56, 54, 53, 54, 99, 54, 99, 54, 102]

They override inspect and force the encoding to UTF-8 (in inspect only).

Note: If you're using pry and have not changed the printer, you won't see the output as described above. You will always see the inner Array of bytes instead.

To read the data back in its correct encoding, use the convenience methods:

data = 'hello'.encode('UTF-16LE')
data.encoding
# => #<Encoding:UTF-16LE>

data.bytes
# => [104, 0, 101, 0, 108, 0, 108, 0, 111, 0]

packed = Multibases.pack('base16', data)
# => [Multibases::EncodedByteArray "f680065006c006c006f00"]

decoded = Multibases.decode(packed)
# => [Multibases::DecodedByteArray "h e l l o "]

decoded.to_s('UTF-16LE')
# => "hello"
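The same point can be shown with core Ruby alone: an encoder only sees bytes, so the encoding label has to be re-attached after decoding. This sketch uses String#unpack1 and Array#pack instead of the gem, purely as an illustration of that idea.

```ruby
data = 'hello'.encode('UTF-16LE')

# Hex-encode the raw bytes. The string's encoding label is not part of the output.
hex = data.unpack1('H*')
# hex == "680065006c006c006f00"

# Decoding yields raw bytes tagged as binary (ASCII-8BIT), not as UTF-16LE.
raw = [hex].pack('H*')

# Re-attach the original encoding on a copy, then transcode for display.
text = raw.dup.force_encoding('UTF-16LE').encode('UTF-8')
# text == "hello"
```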

Implementations

The current multibase table is maintained in the multiformats/multibase repository. At this moment, built-in engines are provided as follows:

| encoding | code | description | implementation |
|---|---|---|---|
| identity | 0x00 | 8-bit binary | bare |
| base1 | 1 | unary (1111) | |
| base2 | 0 | binary (0101) | base2 💨 |
| base8 | 7 | octal | base_x |
| base10 | 9 | decimal | base_x |
| base16 | f | hexadecimal | base16 💨 |
| base16upper | F | hexadecimal | base16 💨 |
| base32hex | v | rfc4648 no padding - highest char | base32 |
| base32hexupper | V | rfc4648 no padding - highest char | base32 |
| base32hexpad | t | rfc4648 with padding | base32 |
| base32hexpadupper | T | rfc4648 with padding | base32 |
| base32 | b | rfc4648 no padding | base32 |
| base32upper | B | rfc4648 no padding | base32 |
| base32pad | c | rfc4648 with padding | base32 |
| base32padupper | C | rfc4648 with padding | base32 |
| base32z | h | z-base-32 (used by Tahoe-LAFS) | base32 |
| base58flickr | Z | base58 flicker | base_x |
| base58btc | z | base58 bitcoin | base_x |
| base64 | m | rfc4648 no padding | base64 💨 |
| base64pad | M | rfc4648 with padding - MIME enc | base64 💨 |
| base64url | u | rfc4648 no padding | base64 💨 |
| base64urlpad | U | rfc4648 with padding | base64 💨 |

Those with a 💨 are marked because they are backed by a C implementation (using pack and unpack) and are therefore supposed to be blazingly fast. Those with a ✨ are marked because they have a custom implementation on top of the generic base_x implementation, which should make them faster than plain base_x.
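The pack and unpack referred to here are Ruby's Array#pack and String#unpack. As a rough illustration of why those engines can lean on C code, the sketch below produces base2, base16 and base64 text with core directives only. It does not use the gem and is not taken from its source. The unpadded URL-safe variant is derived here with tr and delete, which is an assumption about one reasonable approach.

```ruby
# Core Ruby only; these directives are implemented in C inside the interpreter.
base2  = 'mb'.unpack1('B*')      # bit string, most significant bit first
base16 = 'foobar'.unpack1('H*')  # lowercase hex, high nibble first

# 'm0' is strict RFC 4648 base64: padded, without line breaks.
base64pad = ['fooba'].pack('m0')

# Assumed approach for the unpadded URL-safe form: swap the two special
# characters and drop the '=' padding.
base64url = base64pad.tr('+/', '-_').delete('=')

# The matching decode directions.
from_hex    = ['666f6f626172'].pack('H*')
from_base64 = 'Zm9vYmE='.unpack1('m0')
```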

The version of the spec that this repository was last updated for is available via Multibases.multibase_version:

Multibases.multibase_version
# => "1.0.0"

Bring your own engine

The methods of multibases allow you to bring your own engine, and you can save additional memory by only loading multibases/bare.

# Note: This is not how multibase was meant to work. It's supposed to only
#       convert the input from one base to another, and denote what that base
#       is, stored in the output. However, the system is _so_ flexible that this
#       works perfectly for any reversible transformation!
class EngineKlazz
  def initialize(*_)
  end

  def encode(plain)
    plain = plain.bytes unless plain.is_a?(Array)
    Multibases::EncodedByteArray.new(plain.reverse)
  end

  def decode(encoded)
    encoded = encoded.bytes unless encoded.is_a?(Array)
    Multibases::DecodedByteArray.new(encoded.reverse)
  end
end

Multibases.implement 'reverse', 'r', EngineKlazz, 'alphabet'
# => Initializes EngineKlazz with 'alphabet'

Multibases.pack('reverse', 'md')
# => [Multibases::EncodedByteArray "rdm"]

Multibases.decode('rdm')
# => [Multibases::DecodedByteArray "md"]

# Alternatively, you can pass the instantiated engine to the appropriate
# function.
engine = EngineKlazz.new

# Mark the encoding as "existing" and attach a code
Multibases.implement 'reverse', 'r'

# Pack, using a custom engine
Multibases.pack('reverse', 'md', engine)
# => [Multibases::EncodedByteArray "rdm"]

Multibases.decode('rdm', engine)
# => [Multibases::DecodedByteArray "md"]
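Because any reversible transformation can act as an engine, it may be worth checking reversibility before registering one. The following is a minimal sketch under stated assumptions: `ReverseEngine`, `LossyEngine` and `reversible?` are hypothetical names, and the engines return plain byte Arrays instead of the gem's ByteArray classes so that the example runs without the gem loaded.

```ruby
# Hypothetical stand-in for a custom engine, working on plain byte Arrays.
class ReverseEngine
  def encode(plain)
    bytes = plain.is_a?(Array) ? plain : plain.bytes
    bytes.reverse
  end

  def decode(encoded)
    bytes = encoded.is_a?(Array) ? encoded : encoded.bytes
    bytes.reverse
  end
end

# Hypothetical counter-example: dropping duplicate bytes cannot be undone.
class LossyEngine
  def encode(plain)
    plain.bytes.uniq
  end

  def decode(encoded)
    encoded
  end
end

# Hypothetical helper: true when decode(encode(x)) returns the original bytes
# for every sample.
def reversible?(engine, samples)
  samples.all? do |sample|
    engine.decode(engine.encode(sample)) == sample.bytes
  end
end

samples = ['md', '', 'hello']
reversible?(ReverseEngine.new, samples) # true
reversible?(LossyEngine.new, samples)   # false, 'hello' loses one 'l'
```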

Using the built-in encoders/decoders

You can use the built-in encoders and decoders directly.

require 'multibases/base16'

Multibases::Base16.encode('foobar')
# => [Multibases::EncodedByteArray "666f6f626172"]

Multibases::Base16.decode('666f6f626172')
# => [Multibases::DecodedByteArray "foobar"]

These don't add the multibase prefix to the output and they use the canonical encode and decode nomenclature.

The base_x / BaseX encoder does not have a module function. You must instantiate it first. The result is an encoder that uses the base alphabet to determine its base. Currently padding is ❌ not supported for BaseX, but might be in a future update using a second argument or key.

require 'multibases/base_x'

Base3 = Multibases::BaseX.new('012')
# => [Multibases::Base3 alphabet="012" strict]

Base3.encode('foobar')
# => [Multibases::EncodedByteArray "112202210012121110020020001100"]

You can use the same technique to inject a custom alphabet. This can be used on the built-in encoders, even the ones that are not BaseX:

base = Multibases::Base2.new('.!')
# => [Multibases::Base2 alphabet=".!"]

base.encode('foo')
# => [Multibases::EncodedByteArray ".!!..!!..!!.!!!!.!!.!!!!"]

base.decode('.!!...!..!!....!.!!!..!.')
# => [Multibases::DecodedByteArray "bar"]
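For intuition about how an alphabet can determine the base, here is a minimal sketch of a generic alphabet-driven encoder. It is not the gem's BaseX source: `BaseXSketch` is a hypothetical class, it returns plain strings, and it assumes the base58-style convention of mapping each leading zero byte to the first alphabet character.

```ruby
# Hypothetical sketch of a generic alphabet-driven encoder and decoder.
class BaseXSketch
  def initialize(alphabet)
    @alphabet = alphabet.chars
    @base = @alphabet.length # the alphabet size is the base
  end

  def encode(plain)
    bytes = plain.bytes
    # Leading zero bytes would vanish in the integer, so count them separately.
    zeros = bytes.take_while(&:zero?).length
    number = bytes.inject(0) { |acc, byte| (acc << 8) | byte }

    digits = []
    while number.positive?
      number, remainder = number.divmod(@base)
      digits.unshift(@alphabet[remainder])
    end

    (@alphabet.first * zeros) + digits.join
  end

  def decode(encoded)
    chars = encoded.chars
    zeros = chars.take_while { |char| char == @alphabet.first }.length
    # A character outside the alphabet makes index return nil and raises here.
    number = chars.inject(0) { |acc, char| acc * @base + @alphabet.index(char) }

    bytes = []
    while number.positive?
      number, byte = number.divmod(256)
      bytes.unshift(byte)
    end

    ([0] * zeros + bytes).pack('C*')
  end
end

octal = BaseXSketch.new('01234567')
octal.encode('mb')    # "66542", the base8 data shown in the Usage section
octal.decode('66542') # "mb"
```

Treating the whole input as one big integer is simple but slower on long inputs than the bit-aligned bases (2, 16, 32, 64), which can work on fixed-size groups of bits.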

All the built-in encoder/decoders take strings, arrays or byte-arrays as input.

expected = Multibases::Base16.encode("abc")
# => [Multibases::EncodedByteArray "616263"]

expected == Multibases::Base16.encode([97, 98, 99])
# => true

expected == Multibases::Base16.encode(Multibases::ByteArray.new("abc".bytes))
# => true

Related

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at SleeplessByte/ruby-multibase. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the Multibases project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.