tensor-ops
Note: this project was a bit of a wild ride into how generic I could make tensor operations, and was an ill-fated integration of automatic differentiation and tensor operations. See backprop-learn for a more reasonable take, with much simpler type-level programming.
tensor-ops is a library for specifying type-safe tensor manipulations that can be automatically differentiated using reverse-mode AD for automatic gradient descent.
The center of everything is a data type `TOp` that represents a tensor operation:

```haskell
TOp :: [[k]] -> [[k]] -> Type
```
The first argument is a list of the dimensions of the input tensors, and the second argument is a list of the dimensions of the output tensors. So, for example, multiplying a 4x5 matrix with a 5-vector to get a 4-vector would be a:
```haskell
mulFoo :: TOp '[ '[4, 5], '[5] ] '[ '[4] ]
```

using `DataKinds` to get type-level lists and numeric literals. (The actual type is kind-polymorphic in what you use to indicate dimensions.)
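For another illustration of the convention (a hypothetical op, not one shipped with the library), multiplying a 4x5 matrix by a 5x6 matrix to get a 4x6 matrix would follow the same pattern:

```haskell
-- Hypothetical example, following the same convention as mulFoo above:
-- two input tensors (a 4x5 matrix and a 5x6 matrix), one output tensor
-- (a 4x6 matrix).
matMatFoo :: TOp '[ '[4, 5], '[5, 6] ] '[ '[4, 6] ]
```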
A simple feed-forward neural network layer might be represented as:
```haskell
ffLayer :: TOp '[ '[i], '[o, i], '[o] ] '[ '[o] ]
```
Where the layer takes an input `i`-vector, an `o`-by-`i` weights matrix, and an `o`-vector of biases, and produces an `o`-vector of results.
One important thing that makes `TOp`s usable is that they compose: that is, you can build complex `TOp`s by putting together smaller, simpler ones. The `ffLayer` `TOp` above might, for example, be created by chaining a matrix-vector multiplication, a vector-vector addition, and an operation that maps the logistic function over every element in a tensor.
`TOp`s can be "run" on any instance of the `Tensor` typeclass:

```haskell
runTOp
    :: Tensor t
    => TOp ns ms
    -> Prod t ns
    -> Prod t ms
```
(`Prod f as` is a heterogeneous vinyl-like list, where `Prod f '[a,b,c]` contains an `f a`, an `f b`, and an `f c`. So here, it's a collection of tensors of each of the input dimensions.)
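For intuition, here is a minimal sketch of what such a heterogeneous list looks like as a GADT; the library uses an equivalent type from its dependencies rather than this exact definition:

```haskell
{-# LANGUAGE DataKinds, GADTs, PolyKinds, TypeOperators #-}

import Data.Kind (Type)

-- A minimal Prod-like heterogeneous list, for intuition only: each element
-- is an `f a` for the corresponding `a` in the type-level list.
data Prod (f :: k -> Type) (as :: [k]) where
  Ø    :: Prod f '[]
  (:<) :: f a -> Prod f as -> Prod f (a ': as)

infixr 5 :<
```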
The key to the usefulness of this library is that you can also calculate the gradient of a `TOp`, using reverse-mode tensor-level automatic differentiation:
```haskell
gradTOp
    :: Tensor t
    => TOp ns '[ '[] ]  -- result must be single scalar
    -> Prod t ns
    -> Prod t ns
```
So, given a `TOp` producing a scalar and a collection of input tensors, it outputs the derivative of the resulting scalar with respect to each of the input tensors.
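For instance (a hypothetical specialization, reusing the shapes from `mulFoo` above), a scalar-valued op over a 4x5 matrix and a 5-vector gets back a gradient with exactly those input shapes:

```haskell
-- Hypothetical specialization of gradTOp: for any scalar-valued TOp over a
-- 4x5 matrix and a 5-vector, the gradient has the same shapes as the inputs.
gradFoo
    :: Tensor t
    => TOp '[ '[4, 5], '[5] ] '[ '[] ]   -- scalar-valued op
    -> Prod t '[ '[4, 5], '[5] ]         -- inputs
    -> Prod t '[ '[4, 5], '[5] ]         -- gradients, same shapes as inputs
gradFoo = gradTOp
```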
If you run it on `ffLayer` and `x`, `w`, and `b` (input, weights, biases) with a `TOp` producing an error function, it'll output tensors of the same size as `x`, `w`, and `b`, indicating the direction to move each of them in order to minimize the error function. To train the network, you'd step `w` and `b` in that indicated direction.
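As a plain-Haskell illustration of that update step (not part of the tensor-ops API), with a flat list standing in for a single parameter tensor, a conventional gradient-descent step looks like this:

```haskell
-- Illustration only, not the tensor-ops API: a conventional gradient-descent
-- update for one parameter, with a flat list standing in for a tensor.
-- `grad` is the derivative of the error with respect to this parameter, as
-- gradTOp would return it; moving against it reduces the error.
stepParam :: Double -> [Double] -> [Double] -> [Double]
stepParam rate param grad = zipWith (\p g -> p - rate * g) param grad
```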
Right now the user-facing API is non-existent, and usage requires juggling a lot of extra singletons for type inference to work. But hopefully these rough edges can be ironed out in a final version.
Currently, the only way to compose `TOp`s is through their `Category` instance, and a pseudo-`Arrow`-like interface:

```haskell
-- Compose `TOp`s, from Control.Category
(>>>)
    :: TOp ns ms
    -> TOp ms os
    -> TOp ns os

-- Similar to `first` from Control.Arrow
firstOp
    :: forall os ns ms. ()
    => TOp ns ms
    -> TOp (ns ++ os) (ms ++ os)

-- Similar to `second` from Control.Arrow
secondOp
    :: forall os ns ms. ()
    => TOp ns ms
    -> TOp (os ++ ns) (os ++ ms)
```
And so, you can write `ffLayer` above as:

```haskell
import TensorOps.Types
import qualified TensorOps.TOp as TO

ffLayer'
    :: TOp '[ '[i], '[o,i], '[o] ] '[ '[o] ]
ffLayer' = firstOp (TO.swap >>> TO.matVec)
       >>> TO.add
```
Where `TO.swap >>> TO.matVec` turns `'[ '[i], '[o,i] ]` into `'[ '[o] ]` by swapping and doing a matrix-vector multiplication, and `TO.add :: TOp '[ '[o], '[o] ] '[ '[o] ]` adds the result to the bias vector.
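The logistic activation mentioned earlier could be composed on at the end in the same style; this is a hedged sketch that assumes an elementwise mapping op (called `TO.map` here, which may not match the library's actual name or signature):

```haskell
-- Hedged sketch: tack a pointwise logistic activation onto ffLayer'.
-- `TO.map` is an assumed name for an elementwise mapping op and may differ
-- from the library's actual API.
ffLayerLogistic
    :: TOp '[ '[i], '[o,i], '[o] ] '[ '[o] ]
ffLayerLogistic = ffLayer' >>> TO.map (\x -> 1 / (1 + exp (-x)))
```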
This is mostly done for now for simplicity in implementation, to avoid having to worry about representing abstract functions and indexing schemes and stuff like that. But it does make more complex functions a bit of a hassle, so a more flexible, easier-to-write formulation might come in the future.
Of course, composition is always type-safe (with tensor dimensions), and you can't compose two actions that don't agree on tensor dimensions. Yay dependent types.
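For example, given the types above, trying to feed `ffLayer'`'s single `'[o]`-vector output into `TO.add` (which expects two) is rejected by the type checker:

```haskell
-- Rejected at compile time: ffLayer' outputs '[ '[o] ], but TO.add expects
-- '[ '[o], '[o] ], so the dimension lists don't line up.
-- badComp = ffLayer' >>> TO.add
```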
There are currently three working instances of `Tensor`:
- A nested linked list implementation
- A nested vector implementation
- An hmatrix implementation (facilitated by nested vectors for ranks > 2)
If you want to write your own by providing basic linear algebra functions, make your type an instance of the `BLAS` typeclass (which contains the BLAS interface, plus some helpers) and you get it for free! See `TensorOps.BLAS` for details and `TensorOps.BLAS.HMat` for an example implementation.
Haddocks are put up and rendered here, but the library is not super well commented at this point! Most of the API is in the `TensorOps` top-level module.
There are currently two applications, just for testing out the features by using them to implement neural networks: tensor-ops-dots, which uses a simple tiny network to learn a non-linear 2D function with interchangeable backends, and tensor-ops-mnist, which uses a feed-forward neural network with the hmatrix backend to classify the ubiquitous MNIST data set. Run them with `--help` for more information.