Programming Bitcoin Script Transaction (Crypto) Contracts Step-by-Step

Let's start with building your own bitcoin stack machine from zero / scratch and let's run your own bitcoin ops (operations)...

BEWARE: Bitcoin is a gigantic ponzi scheme¹. To the moon!? The new gold standard!? Do NOT "invest" trying to get-rich-quick HODLing. Why not? The bitcoin code is archaic and out-of-date. Burn, baby, burn! Proof-of-work / waste is a global energy environmental disaster using 300 kWh per bitcoin transaction (!) that's about 179 kilograms of CO₂ emissions². Programmable money (or the internet of value) for all future generations with (bitcoin) script without loops and jumps (gotos) and all "stateless"!? LOL.

¹: (Source: Best of Bitcoin Maximalist - Scammers, Morons, Clowns, Shills & BagHODLers - Inside The New New Crypto Ponzi Economics)

²: Assuming, let's say, 0.596 kilograms of CO₂ per kWh (that's the carbon intensity of electricity in Germany), that's about 179 kilograms of CO₂ per bitcoin transaction (300 kWh × 0.596 kg/kWh). For more insights see the Bitcoin Energy Consumption Index.

Inside Bitcoin Script

Did you know? Every (yes, every) bitcoin transaction (payment) runs a contract script (one half coming from the "output" or "lock" transaction and the other half coming from the "input" or "unlock" transaction). The programming language is called simply (bitcoin) script.

Bitcoin uses a scripting system for transactions. Forth-like, Script is simple, stack-based, and processed from left to right. It is intentionally not Turing-complete, with no loops.

(Source: Script @ Bitcoin Wiki)

First impression. Adding 2+2 in Bitcoin Script starting from zero / scratch:

## A simple stack machine
def op_add( stack )
  left  = stack.pop
  right = stack.pop
  stack.push( left + right )
end

def op_2( stack )
  stack.push( 2 )
end

## Let's run!

stack = []
op_2( stack )     #=> stack = [2]
op_2( stack )     #=> stack = [2,2]
op_add( stack )   #=> stack = [4]

(Source: stackmachine_add.rb)

Yes, that's all the magic! You have built your own stack machine with two operations / ops, that is, op_add and op_2.

The op_2 operation pushes the number 2 onto the stack. The op_add operation pops the top two numbers from the stack and pushes their sum back onto the stack.
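Script is processed from left to right, as the Bitcoin Wiki quote notes. A minimal sketch of such a loop, assuming the script arrives as a space-separated string of op names. The `ops` table and the `run_script` helper are names made up for this illustration and not part of Bitcoin:

```ruby
## Hypothetical op table: op name => lambda working on the stack
def ops
  {
    "OP_2"   => ->(stack) { stack.push( 2 ) },
    "OP_ADD" => ->(stack) {
      right = stack.pop
      left  = stack.pop
      stack.push( left + right )
    },
  }
end

## Run the ops one by one, left to right, on a fresh stack
def run_script( script )
  stack = []
  script.split.each do |name|
    op = ops.fetch( name ) { raise ArgumentError, "unknown op: #{name}" }
    op.call( stack )
  end
  stack
end

run_script( "OP_2 OP_2 OP_ADD" )   #=> [4]
```

Adding a new operation then only means adding one more entry to the table.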

Aside - What's a Stack? Push 'n' Pop

A stack is a last-in first-out (LIFO) data structure. Use push to add an element to the top of the stack and use pop to remove the top element from the stack. Example:

stack = []                   #=> []
stack.empty?                 #=> true

stack.push( 1 )              #=> [1]
stack.empty?                 #=> false
stack.push( 2 )              #=> [1, 2]
stack.push( 3 )              #=> [1, 2, 3]
stack.push( "<signature>" )  #=> [1, 2, 3, "<signature>"]
stack.push( "<pubkey>")      #=> [1, 2, 3, "<signature>", "<pubkey>"]

stack.pop                    #=> "<pubkey>"
stack                        #=> [1, 2, 3, "<signature>"]
stack.pop                    #=> "<signature>"
stack                        #=> [1, 2, 3]

stack.push( 4 )              #=> [1, 2, 3, 4]
stack.push( 5 )              #=> [1, 2, 3, 4, 5]

stack.pop                    #=> 5
stack                        #=> [1, 2, 3, 4]
stack.pop                    #=> 4
stack                        #=> [1, 2, 3]
stack.pop                    #=> 3
stack                        #=> [1, 2]
stack.pop                    #=> 2
stack                        #=> [1]
stack.empty?                 #=> false
stack.pop                    #=> 1
stack                        #=> []
stack.empty?                 #=> true
stack.pop                    #=> nil

(Source: stack.rb)

Unlock+Lock / Input+Output / ScriptSig+ScriptPubKey

In "real world" bitcoin the script has two parts / halves in two transactions that get combined. The "lock" or "output" or "ScriptPubKey" script that locks the "unspent transaction output (UTXO)", and the "unlock" or "input" or "ScriptSig" script that unlocks the bitcoins.

Anyone Can Spend (Unlock) the Outputs (Bitcoins)

The bitcoins are yours if the bitcoins haven't been spent yet - see blockchain and how it solves the double-spending problem :-) - AND if the script returns with true, that is, 1 is on top of the stack.

## A simple stack machine
def op_true( stack )
  stack.push( 1 )
end

## Let's run!

stack = []
##  I) ScriptSig (input/unlock) part
op_true( stack )  #=> stack = [1]

## II) ScriptPubKey (output/lock) part
##     <Empty>

(Source: stackmachine_anyone.rb)

Bingo! Yes, that's all the magic! The op_true operation pushes the number 1, that is, true onto the stack.

The "official" bitcoin script notation reads:

ScriptSig (input):    OP_TRUE
ScriptPubKey:         (empty)

Now let's split the adding 2+2 script into a two part puzzle, that is, ?+2=4 or into ScriptSig and ScriptPubKey. If you know the answer you can "unlock" the bounty, that is, the bitcoins are yours! Here's the challenge:

## A simple stack machine
def op_add( stack )
  left  = stack.pop
  right = stack.pop
  stack.push( left + right )
end

def op_2( stack )
  stack.push( 2 )
end

def op_4( stack )
  stack.push( 4 )
end

def op_equal( stack )
  left  = stack.pop
  right = stack.pop
  stack.push( left == right ? 1 : 0 )
end

## Let's run!

stack = []
##  I) ScriptSig (input/unlock) part
##     FIX!!! - add your "unlock" stack operation / operations here

## II) ScriptPubKey (output/lock) part
op_2( stack )      #=> stack = [?, 2]
op_add( stack )    #=> stack = [4]
op_4( stack )      #=> stack = [4,4]
op_equal( stack )  #=> stack = [1]

(Source: stackmachine_puzzle.rb)

The "official" bitcoin script notation reads:

ScriptSig (input):    ?
ScriptPubKey:         OP_2 OP_ADD OP_4 OP_EQUAL
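Spoiler ahead. One way to try out a candidate answer is to run the unlock half first, then the lock half on the same stack, and check that 1 ends up on top. A minimal sketch, assuming the simple Ruby ops from this guide; the `unlocks?` helper is made up for this illustration:

```ruby
## Ops from the puzzle
def op_add( stack )
  right = stack.pop
  left  = stack.pop
  stack.push( left + right )
end

def op_2( stack )
  stack.push( 2 )
end

def op_4( stack )
  stack.push( 4 )
end

def op_equal( stack )
  right = stack.pop
  left  = stack.pop
  stack.push( left == right ? 1 : 0 )
end

## Hypothetical helper: the block is the unlock half, the body is the lock half
def unlocks?
  stack = []
  ##  I) ScriptSig (input/unlock) part
  yield stack
  ## II) ScriptPubKey (output/lock) part
  op_2( stack )
  op_add( stack )
  op_4( stack )
  op_equal( stack )
  stack.last == 1      # unlocked if 1 is on top of the stack
end

unlocks? { |stack| op_2( stack ) }   #=> true   (2+2 == 4)
unlocks? { |stack| op_4( stack ) }   #=> false  (4+2 != 4)
```

So OP_2 is one ScriptSig that unlocks the bounty.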

If you check all Bitcoin script operations - the following ops should no longer be a mystery:

Constants

| Word | Opcode | Hex | Input | Output | Description |
|------|--------|-----|-------|--------|-------------|
| OP_0, OP_FALSE | 0 | 0x00 | Nothing. | (empty value) | An empty array of bytes is pushed onto the stack. (This is not a no-op: an item is added to the stack.) |
| OP_1, OP_TRUE | 81 | 0x51 | Nothing. | 1 | The number 1 is pushed onto the stack. |
| OP_2-OP_16 | 82-96 | 0x52-0x60 | Nothing. | 2-16 | The number in the word name (2-16) is pushed onto the stack. |
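The opcode numbers for OP_1 to OP_16 are consecutive, so the pushed number can be read off the opcode by subtracting 80 (0x50). A small sketch of that arithmetic; the `small_int_for` helper is made up for this illustration:

```ruby
## Hypothetical helper: which number does a constant opcode push?
def small_int_for( opcode )
  case opcode
  when 0x00        # OP_0 / OP_FALSE (on the real stack: an empty byte array)
    0
  when 0x51..0x60  # OP_1 (81) to OP_16 (96)
    opcode - 0x50
  else
    raise ArgumentError, "not a small-number constant: 0x%02x" % opcode
  end
end

small_int_for( 0x51 )   #=> 1
small_int_for( 0x52 )   #=> 2
small_int_for( 0x60 )   #=> 16
```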

Bitwise logic

| Word | Opcode | Hex | Input | Output | Description |
|------|--------|-----|-------|--------|-------------|
| OP_EQUAL | 135 | 0x87 | x1 x2 | True / false | Returns 1 if the inputs are exactly equal, 0 otherwise. |

Arithmetic

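| Word | Opcode | Hex | Input | Output | Description |
|------|--------|-----|-------|--------|-------------|
| OP_ADD | 147 | 0x93 | a b | out | a is added to b. |
| OP_MUL | 149 | 0x95 | a b | out | a is multiplied by b. disabled. |
| OP_DIV | 150 | 0x96 | a b | out | a is divided by b. disabled. |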

Trivia Corner: Did you know? The OP_MUL for multiplications (e.g. 2*2) has been banned, that is, disabled! Why? Because of security concerns, that is, fear of stack overflows. What about OP_DIV for divisions (e.g. 4/2)? Don't ask! Ask who's protecting you from stack underflows? So what's left for programming - not much really other than checking signatures and timelocks :-).
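A sketch of how the simple Ruby stack machine from this guide might treat disabled ops and stack underflows. A script that contains a disabled op or runs out of stack items fails in Bitcoin; the `pop2` helper and the error messages below are made up for this illustration:

```ruby
## Hypothetical helper: pop two items or fail loudly (no silent nil)
def pop2( stack )
  raise ArgumentError, "stack underflow" if stack.size < 2
  right = stack.pop
  left  = stack.pop
  [left, right]
end

def op_add( stack )
  left, right = pop2( stack )
  stack.push( left + right )
end

## Disabled: the script fails as soon as the op shows up
def op_mul( stack )
  raise "OP_MUL is disabled"
end

stack = [2, 2]
op_add( stack )     #=> stack = [4]

## op_add( [2] )    would raise ArgumentError (stack underflow)
## op_mul( [2,2] )  would raise RuntimeError  (OP_MUL is disabled)
```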

Standard Scripts

You don't have to start from zero / scratch. Bitcoin has many standard script templates. The most important include:

| Short Name | Long Name |
|------------|-----------|
| p2pk | Pay-to-pubkey |
| p2pkh | Pay-to-pubkey-hash |
| p2sh | Pay-to-script-hash |

Standard Scripts with SegWit (Segregated Witness)

| Short Name | Long Name |
|------------|-----------|
| p2wpkh | Pay-to-witness-pubkey-hash |
| p2wsh | Pay-to-witness-script-hash |

p2pk - Pay-to-pubkey

Pay-to-pubkey (p2pk) is the simplest standard script and was used in the early days including by Satoshi Nakamoto (the pseudonymous Bitcoin founder).

Bitcoin Trivia:

As initially the sole and subsequently the predominant miner, Nakamoto was awarded bitcoin at genesis and for 10 days afterwards. Except for test transactions these remain unspent since mid January 2009. The public bitcoin transaction log shows that Nakamoto's known addresses contain roughly one million bitcoins. At bitcoin's peak in December 2017, this was worth over US$19 billion, making Nakamoto possibly the 44th richest person in the world at the time.

(Source: Satoshi Nakamoto @ Wikipedia)

The one million bitcoins are yours if the pay-to-pubkey (p2pk) script returns with true, that is, 1 is on top of the stack. The only input you need to unlock the fortune is the signature. Are you Satoshi? Let's try:

## Bitcoin crypto helper

class Bitcoin
  def self.checksig( sig, pubkey )
    ## "crypto" magic here
    ##  for testing always return false for now; sorry
    false
  end
end  


## A simple stack machine

def op_checksig( stack )
  pubkey = stack.pop
  sig    = stack.pop
  if Bitcoin.checksig( sig, pubkey )
    stack.push( 1 )
  else
    stack.push( 0 )
  end
end

## Let's run!

stack = []
##  I) ScriptSig (input/unlock) part
stack.push( "<sig>" )   #=> stack = ["<sig>"]

## II) ScriptPubKey (output/lock) part
stack.push( "<pubkey>" ) #=> stack = ["<sig>", "<pubkey>"]
op_checksig( stack )    #=> stack = [0]

(Source: pay-to-pubkey.rb)

Bingo! Yes, that's all the magic! The op_checksig operation pops two elements from the stack, that is, the public key (pubkey) and the signature (sig). If the elliptic curve crypto validates the signature (from the input/unlock transaction) using the public key (from the output/lock transaction), then the number 1, that is, true gets pushed onto the stack and the fortune is yours! If not, the number 0, that is, false gets pushed onto the stack and you're out of luck. Sorry.

The "official" bitcoin script notation reads:

ScriptSig (input): <sig>
ScriptPubKey:      <pubKey> OP_CHECKSIG

Note: Can you guess where the input / unlock part got its ScriptSig name and where the output / lock part got its ScriptPubKey name? Yes, from the pay-to-pubkey script.

Aside - Ivy - Higher-Level Bitcoin Script Language

What's Ivy?

From the project's readme:

Ivy is a higher-level language that allows you to write (crypto) contracts for the Bitcoin protocol. Ivy can compile to opcodes for Bitcoin’s stack machine, Bitcoin Script, and can be used to create SegWit-compatible Bitcoin addresses...

You can try out Ivy using the Ivy Playground for Bitcoin, which allows you to create test contracts and try spending them, all in a sandboxed environment.

(Source: Ivy Language Documentation)

Let's look at the pay-to-pubkey script in Ivy:

contract LockWithPublicKey(publicKey: PublicKey, val: Value) {
  clause spend(sig: Signature) {
    verify checkSig(publicKey, sig)
    unlock val
  }
}

And - surprise, surprise - the higher-level script compiles to

<pubKey> OP_CHECKSIG

Elliptic Curve Cryptography

So what does a "real world" public key (pubkey) look like? In the early days Satoshi Nakamoto used the uncompressed SEC (Standards for Efficient Cryptography) format for the public key that results in 65 raw bytes. Bitcoin uses elliptic curve cryptography and the public key is a point (x,y) on the curve where the x and y coordinates are each 256-bit (32 byte) numbers.

In the uncompressed format, place the x and y coordinate next to each other, then prefix with 04 to indicate that it is an uncompressed public key:

prefix (1 byte)         : 04
x-coordinate (32 bytes) : fe53c78e36b86aae8082484a4007b706d5678cabb92d178fc95020d4d8dc41ef
y-coordinate (32 bytes) : 44cfbb8dfa7a593c7910a5b6f94d079061a7766cbeed73e24ee4f654f1e51904
    =>
04fe53c78e36b86aae8082484a4007b706d5678cabb92d178fc95020d4d8dc41ef44cfbb8dfa7a593c7910a5b6f94d079061a7766cbeed73e24ee4f654f1e51904

The compressed form relies on a trick: because the elliptic curve is symmetric about its x-axis, each x-coordinate has only two possible y-coordinates, one even and one odd.

Thus, in the compressed public key format place the x coordinate along with a prefix (02 or 03) that tells whether the y is even (02) or odd (03).

prefix (1 byte)         : 03
x-coordinate (32 bytes) : df51984d6b8b8b1cc693e239491f77a36c9e9dfe4a486e9972a18e03610a0d22
     =>
03df51984d6b8b8b1cc693e239491f77a36c9e9dfe4a486e9972a18e03610a0d22
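Going the other way, the y-coordinate can be recovered from a compressed key by solving the curve equation and picking the root with the matching parity. A minimal sketch, assuming the secp256k1 curve y² = x³ + 7 over the prime field p = 2²⁵⁶ - 2³² - 977 (the curve parameters are not spelled out in this guide). The `field_prime` and `decompress` helpers are names made up for this illustration:

```ruby
## secp256k1 field prime
def field_prime
  2**256 - 2**32 - 977
end

## Hypothetical helper: compressed pubkey (hex string, 33 bytes) => [x, y] integers
def decompress( pubkey )
  prefix = pubkey[0, 2]
  raise ArgumentError, "expected prefix 02 or 03" unless ["02", "03"].include?( prefix )

  prime     = field_prime
  x         = pubkey[2..-1].to_i( 16 )
  y_squared = (x.pow( 3, prime ) + 7) % prime

  ## prime % 4 == 3, so a square root is y_squared^((prime+1)/4) mod prime
  y = y_squared.pow( (prime + 1) / 4, prime )
  raise ArgumentError, "x is not on the curve" unless y.pow( 2, prime ) == y_squared

  ## Pick the root whose parity matches the prefix (02 = even / 03 = odd)
  y = prime - y  if y.even? != (prefix == "02")
  [x, y]
end

## Compressed key for the example private key 1234 (see pubkey.rb)
x, y = decompress( "02e37648435c60dcd181b3d41d50857ba5b5abebe279429aa76558f6653f1658f2" )
uncompressed = "04" + "%064x" % x + "%064x" % y
```

This is why dropping the y-coordinate loses no information: one parity bit in the prefix is enough to rebuild it.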

Let's create a public key from the private key

Note: Let's use the 3rd party Elliptic Curve Digital Signature Algorithm (ECDSA) library / gem by David Grayson.

require 'pp'
require 'ecdsa'           # Use an elliptic curve library

# This private key is just an example. It should be much more secure!
privatekey = 1234

# Elliptic curve multiplication
group = ECDSA::Group::Secp256k1                          # Select the curve used in Bitcoin
point = group.generator.multiply_by_scalar( privatekey ) # Multiply by integer (not hex)
#=> <ECDSA::Point: secp256k1,
#       0xe37648435c60dcd181b3d41d50857ba5b5abebe279429aa76558f6653f1658f2,
#       0x6d2ee9a82d4158f164ae653e9c6fa7f982ed8c94347fc05c2d068ff1d38b304c>

# Uncompressed format (with prefix 04)
#   Convert to 64 hexstring characters (32 bytes) in length
prefix = '04'
pubkey = prefix + "%064x" % point.x + "%064x" % point.y
#=> "04e37648435c60dcd181b3d41d50857ba5b5abebe279429aa76558f6653f1658f26d2ee9a82d4158f164ae653e9c6fa7f982ed8c94347fc05c2d068ff1d38b304c"

# Compressed format (with prefix - 02 = even / 03 = odd)
#   Instead of using both x and y coordinates,
#   just use the x-coordinate and whether y is even/odd
prefix = point.y % 2 == 0 ? '02' : '03'
pubkey = prefix + "%064x" % point.x
#=> "02e37648435c60dcd181b3d41d50857ba5b5abebe279429aa76558f6653f1658f2"

(Source: pubkey.rb)

p2pkh - Pay-to-pubkey-hash

...

Aside - What's Hash160?

It's a hash function to hash and shorten public keys. An uncompressed public key shrinks from 65 bytes to 20 bytes, a compressed one from 33 bytes to 20 bytes. Example:

pubkey          = 02b4632d08485ff1df2db55b9dafd23347d1c47a457072a1e87be26896549a8737
hash160(pubkey) = 93ce48570b55c42c2af816aeaba06cfee1224fae

To compute the Hash160 run the public key through the SHA256 and RIPEMD160 hash functions. Example:

require 'digest'                           # Hash (Digest) Functions

def hash160( pubkey )
  binary    = [pubkey].pack( "H*" )          # Convert hex to binary first before hashing
  sha256    = Digest::SHA256.digest( binary )
  ripemd160 = Digest::RMD160.digest( sha256 )
  ripemd160.unpack( "H*" )[0]                # Convert back to hex
end

pubkey = "02b4632d08485ff1df2db55b9dafd23347d1c47a457072a1e87be26896549a8737"
hash160( pubkey )  
#=> "93ce48570b55c42c2af816aeaba06cfee1224fae"

(Source: hash160.rb)

Security Trivia I: Why use SHA256 and RIPEMD160?

RIPEMD160 gets used because it results in a short 160-bit (20 byte) digest BUT is not the strongest hash function on its own, thus, SHA256 gets used for more strength. Best of both worlds.

Security Trivia II: What's RIPEMD160?

RACE¹ Integrity Primitives Evaluation Message Digest 160-bit

¹: Research and development in Advanced Communications technologies in Europe

require 'digest'                           # Hash (Digest) Functions

def ripemd160( message )
  Digest::RMD160.hexdigest( message )
end

ripemd160( "The quick brown fox jumps over the lazy dog" )
#=> "37f332f68db77bd9d7edd4969571ad671cf9dd3b"

ripemd160( "The quick brown fox jumps over the lazy cog" )
#=> "132072df690933835eb8b6ad0b77e7b6f14acad7"

# The hash of a zero-length string is:
ripemd160( "" )
#=> "9c1185a5c5e9fc54612808977ee8f548b2258d31"

(Source: RIPEMD @ Wikipedia)

...

The "official" bitcoin script notation reads:

ScriptSig (input): <sig> <pubKey>
ScriptPubKey:      OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
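The script above can be walked through on the simple Ruby stack machine, in the same style as the pay-to-pubkey example. This is a sketch under assumptions: the `run_p2pkh` helper is made up for this illustration, and the signature check is a placeholder lambda (`accept_all`) that accepts everything and does NO real crypto:

```ruby
require 'digest'

def hash160( hex )
  binary = [hex].pack( "H*" )     # Convert hex to binary before hashing
  Digest::RMD160.hexdigest( Digest::SHA256.digest( binary ) )
end

def op_dup( stack )
  stack.push( stack.last )
end

def op_hash160( stack )
  stack.push( hash160( stack.pop ) )
end

def op_equalverify( stack )
  right = stack.pop
  left  = stack.pop
  raise "OP_EQUALVERIFY failed" unless left == right
end

def op_checksig( stack, checksig )
  pubkey = stack.pop
  sig    = stack.pop
  stack.push( checksig.call( sig, pubkey ) ? 1 : 0 )
end

## Hypothetical helper: run both halves on one stack
def run_p2pkh( sig, pubkey, pubkeyhash, checksig )
  stack = []
  ##  I) ScriptSig (input/unlock) part
  stack.push( sig )                # [sig]
  stack.push( pubkey )             # [sig, pubkey]
  ## II) ScriptPubKey (output/lock) part
  op_dup( stack )                  # [sig, pubkey, pubkey]
  op_hash160( stack )              # [sig, pubkey, hash160(pubkey)]
  stack.push( pubkeyhash )         # [sig, pubkey, hash160(pubkey), pubkeyhash]
  op_equalverify( stack )          # [sig, pubkey]  (or the script fails)
  op_checksig( stack, checksig )   # [1] or [0]
  stack
end

pubkey     = "02b4632d08485ff1df2db55b9dafd23347d1c47a457072a1e87be26896549a8737"
accept_all = ->(_sig, _pubkey) { true }   # placeholder, NOT a real signature check

run_p2pkh( "<sig>", pubkey, hash160( pubkey ), accept_all )   #=> [1]
```

A public key that does not hash to the locked `<pubKeyHash>` already fails at OP_EQUALVERIFY, before any signature gets checked.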

And the Ivy higher-level version reads:

contract LockWithPublicKeyHash(pubKeyHash: Hash160(PublicKey), val: Value) {
  clause spend(pubKey: PublicKey, sig: Signature) {
    verify hash160(pubKey) == pubKeyHash
    verify checkSig(pubKey, sig)
    unlock val
  }
}

that compiles to

OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

To be continued ...

Appendix

Aside - Simplicity - A New Bitcoin Contract Language?

Simplicity is a blockchain programming language designed as an alternative to Bitcoin script.

(Source: Simplicity README)

Why Simplicity?

Bitcoin's Script language is generally limited to combinations of digital signature checks, timelocks, and hashlocks. While impressive protocols (such as the Lightning Network) have been built on these primitives, Bitcoin's Script language lacks the expressiveness needed for more complex contract scripts.

(Source: Simplicity: High-Assurance Bitcoin Contract Scripting by Russell O'Connor, Andrew Poelstra, Blockstream Research, November 2018)

Simplicity: A New Language for Blockchains (Whitepaper Abstract)

Simplicity is a typed, combinator-based, functional language without loops and recursion, designed to be used for crypto-currencies and blockchain applications. It aims to improve upon existing crypto-currency languages, such as Bitcoin's Script, Ethereum's Solidity or Michelson's Liquidity, while avoiding some of the problems they face. Simplicity comes with formal denotational semantics defined in Coq, a popular, general purpose software proof assistant. Simplicity also includes operational semantics that are defined with an abstract machine that we call the Bit Machine. The Bit Machine is used as a tool for measuring the computational space and time resources needed to evaluate Simplicity programs. Owing to its Turing incompleteness, Simplicity is amenable to static analysis that can be used to derive upper bounds on the computational resources needed, prior to execution. While Turing incomplete, Simplicity can express any finitary function, which we believe is enough to build useful contracts for blockchain applications.

(Source: Simplicity Whitepaper (PDF) by Russell O'Connor, Blockstream, December 2017)

License

The Programming Bitcoin Script Step-by-Step book / guide is dedicated to the public domain. Use it as you please with no restrictions whatsoever.