MLKernels

Kernels, the machine learning ones (also, mean functions)

TLDR:

>>> import numpy as np

>>> from mlkernels import EQ, Linear

>>> k1 = 2 * Linear() + 1

>>> k1
2 * Linear() + 1

>>> k1(np.linspace(0, 1, 3))  # Structured matrices enable efficiency.
<low-rank matrix: shape=3x3, dtype=float64, rank=2
 left=[[0.  1. ]
       [0.5 1. ]
       [1.  1. ]]
 middle=[[2. 0.]
         [0. 1.]]
 right=[[0.  1. ]
        [0.5 1. ]
        [1.  1. ]]>

>>> import lab as B

>>> B.dense(k1(np.linspace(0, 1, 3)))  # Discard structure: get a regular NumPy array.
array([[1. , 1. , 1. ],
       [1. , 1.5, 2. ],
       [1. , 2. , 3. ]])

>>> k2 = 2 + EQ() * Linear()

>>> k2
2 * 1 + EQ() * Linear()

>>> k2(np.linspace(0, 1, 3))
<dense matrix: shape=3x3, dtype=float64
 mat=[[2.    2.    2.   ]
      [2.    2.25  2.441]
      [2.    2.441 3.   ]]>

>>> B.dense(k2(np.linspace(0, 1, 3)))
array([[2.        , 2.        , 2.        ],
       [2.        , 2.25      , 2.44124845],
       [2.        , 2.44124845, 3.        ]])

Installation

pip install mlkernels

See also the instructions here.

Usage

Let k be a kernel, e.g. k = EQ(). Then k(x, y) constructs the kernel matrix for all pairs of points in x and y, k(x) is shorthand for k(x, x), and k.elwise(x, y) evaluates the kernel element-wise for corresponding points in x and y.

Example:

>>> k = EQ()

>>> k(np.linspace(0, 1, 3))
<dense matrix: shape=3x3, dtype=float64
 mat=[[1.    0.882 0.607]
      [0.882 1.    0.882]
      [0.607 0.882 1.   ]]>
 
>>> k.elwise(np.linspace(0, 1, 3), 0)
array([[1.        ],
       [0.8824969 ],
       [0.60653066]])
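
Kernels can also be evaluated between two different collections of inputs. A minimal sketch (the inputs below are purely illustrative):

import numpy as np

from mlkernels import EQ

k = EQ()
x = np.linspace(0, 1, 3)
y = np.linspace(0, 1, 2)

# Kernel matrix between all pairs of points in `x` and `y`; shape (3, 2).
K = k(x, y)

# Element-wise evaluation for corresponding points; shape (3, 1).
v = k.elwise(x, x + 1)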

Let m be a mean, e.g. m = TensorProductMean(lambda x: x ** 2). Then m(x) constructs the mean vector for the points in x.

Example:

>>> m = TensorProductMean(lambda x: x ** 2)

>>> m(np.linspace(0, 1, 3))
array([[0.  ],
       [0.25],
       [1.  ]])

Inputs to kernels and means are interpreted as follows: a scalar is a single input, a rank-1 tensor is a collection of one-dimensional inputs, a rank-2 tensor is a collection of multi-dimensional inputs with one data point per row, and any further leading dimensions are treated as batch dimensions.

Example:

>>> k = EQ()

>>> k(0)  # One scalar input
<dense matrix: batch=(), shape=(1, 1), dtype=float64
 mat=[[1.]]>

>>> k(np.linspace(0, 1, 3))  # Three single-dimensional inputs
<dense matrix: batch=(), shape=(3, 3), dtype=float64
 mat=[[1.    0.882 0.607]
      [0.882 1.    0.882]
      [0.607 0.882 1.   ]]>

>>> k(np.random.randn(3, 2))  # Three two-dimensional inputs
<dense matrix: batch=(), shape=(3, 3), dtype=float64
 mat=[[1.    0.606 0.399]
      [0.606 1.    0.931]
      [0.399 0.931 1.   ]]>

>>> k(np.random.randn(2, 3, 2))  # A batch of two times three two-dimensional inputs
<dense matrix: batch=(2,), shape=(3, 3), dtype=float64
 mat=[[[1.    0.15  0.559]
       [0.15  1.    0.678]
       [0.559 0.678 1.   ]]

      [[1.    0.891 0.627]
       [0.891 1.    0.638]
       [0.627 0.638 1.   ]]]>

Finally, if you are simultaneously computing means and kernel matrices, then speed-ups may be possible. To obtain these speed-ups, use mean_var instead of first calling m(x) and then k(x); and use mean_var_diag instead of first calling m(x) and then k.elwise(x).

Example:

>>> from mlkernels import mean_var, mean_var_diag

>>> m = TensorProductMean(lambda x: x ** 2)

>>> k = EQ()

>>> x = np.linspace(0, 1, 3)

>>> m(x), k(x)         # Less efficient
(array([[0.  ],
        [0.25],
        [1.  ]]),
 <dense matrix: batch=(), shape=(3, 3), dtype=float64
  mat=[[1.    0.882 0.607]
       [0.882 1.    0.882]
       [0.607 0.882 1.   ]]>)

>>> mean_var(m, k, x)  # More efficient
(array([[0.  ],
        [0.25],
        [1.  ]]),
 <dense matrix: batch=(), shape=(3, 3), dtype=float64
  mat=[[1.    0.882 0.607]
       [0.882 1.    0.882]
       [0.607 0.882 1.   ]]>)

>>> m(x), k.elwise(x)       # Less efficient
(array([[0.  ],
        [0.25],
        [1.  ]]),
 array([[1.],
        [1.],
        [1.]]))

>>> mean_var_diag(m, k, x)  # More efficient
(array([[0.  ],
        [0.25],
        [1.  ]]),
 array([[1.],
        [1.],
        [1.]]))

AutoGrad, TensorFlow, PyTorch, or JAX? Your Choice!

from mlkernels.autograd import EQ, Linear
from mlkernels.tensorflow import EQ, Linear
from mlkernels.torch import EQ, Linear
from mlkernels.jax import EQ, Linear
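
For example, with the PyTorch backend the same kernels operate directly on PyTorch tensors. A minimal sketch, assuming PyTorch is installed:

import lab as B
import torch

from mlkernels.torch import EQ

k = EQ()
x = torch.linspace(0, 1, 3)

K = k(x)              # Structured matrix backed by PyTorch tensors.
K_dense = B.dense(K)  # A regular `torch.Tensor`, compatible with autograd.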

Available Kernels

See here for a nicely rendered version of the list below.

Constants function as constant kernels. Besides that, the following kernels are available:
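
As a small illustration of constants acting as kernels (mirroring the TLDR above, where the `1` in `2 * Linear() + 1` is a constant kernel):

from mlkernels import EQ

# The `5` is promoted to a constant kernel that evaluates to 5 everywhere,
# so `k` adds 5 to every entry of the EQ kernel matrix.
k = 5 + EQ()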

Available Means

Constants function as constant means. Besides that, the following means are available:
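
Similarly, a small sketch of a constant acting as a mean (assuming means support the same arithmetic as kernels):

import numpy as np

from mlkernels import TensorProductMean

# The `1` is promoted to a constant mean, so `m` evaluates to `x ** 2 + 1`.
m = 1 + TensorProductMean(lambda x: x ** 2)
m(np.linspace(0, 1, 3))  # Mean vector of shape (3, 1).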

Compositional Design
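
Kernels and means are built up compositionally from simpler parts using arithmetic and input transformations such as stretch; a brief sketch based on the operations used elsewhere on this page:

from mlkernels import EQ, Linear

# Sums and products of kernels are again kernels, and numbers are promoted
# to constant kernels.
k1 = EQ() + Linear()
k2 = 2 * EQ() * Linear()

# `stretch` rescales the inputs, which effectively sets a length scale.
k3 = EQ().stretch(2)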

Displaying Kernels and Means

Kernels and means have a display method. The display method accepts a callable formatter that will be applied before any value is printed. This comes in handy when pretty printing kernels.

Example:

>>> print((2.12345 * EQ()).display(lambda x: f"{x:.2f}"))
2.12 * EQ()

Properties of Kernels and Means
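
For example, kernels carry a stationarity flag, which the `_stationary` hook in the custom-kernel example further below feeds into. A minimal sketch (the property name is inferred from that example):

from mlkernels import EQ, Linear

EQ().stationary      # `True`: the EQ kernel depends only on distances between inputs.
Linear().stationary  # `False`: the linear kernel is not stationary.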

Structured Matrix Types

MLKernels uses an extension of LAB to accelerate linear algebra with structured linear algebra primitives.

Example:

>>> import lab as B

>>> from mlkernels import Delta, pairwise

>>> k = 2 * Delta()

>>> x = np.linspace(0, 5, 10)

>>> k(x)  # Preserve structure.
<diagonal matrix: shape=10x10, dtype=float64
 diag=[2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]>

>>> B.dense(k(x))  # Do not preserve structure.
array([[2., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 2., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 2., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 2., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 2., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 2., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 2., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 2., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 2., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 2.]])

These structured matrices are compatible with LAB: they know how to efficiently add, multiply, and do other linear algebra operations.

>>> import lab as B

>>> B.matmul(pairwise(k, x), pairwise(k, x))
<diagonal matrix: shape=10x10, dtype=float64
 diag=[4. 4. 4. 4. 4. 4. 4. 4. 4. 4.]>

As in the above example, you can convert structured primitives to regular NumPy/TensorFlow/PyTorch/JAX arrays by calling B.dense:

>>> import lab as B

>>> B.dense(B.matmul(pairwise(k, x), pairwise(k, x)))
array([[4., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 4., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 4., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 4., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 4., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 4., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 4., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 4., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 4., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 4.]])

Implementing Your Own Kernel

An example is most helpful:

import lab as B
from algebra.util import identical
from matrix import Dense
from plum import dispatch

from mlkernels import Kernel, pairwise, elwise


class EQWithLengthScale(Kernel):
    """Exponentiated quadratic kernel with a length scale.

    Args:
        scale (scalar): Length scale of the kernel.
    """

    def __init__(self, scale):
        self.scale = scale

    def _compute(self, dists2):
        # This computes the kernel given squared distances. We use `B` to provide a
        # backend-agnostic implementation.
        return B.exp(-0.5 * dists2 / self.scale ** 2)

    def render(self, formatter):
        # This method determines how the kernel is displayed.
        return f"EQWithLengthScale({formatter(self.scale)})"

    @property
    def _stationary(self):
        # This property can be defined to return `True` to indicate that the kernel
        # is stationary. By default, kernels are assumed to not be stationary.
        return True

    @dispatch
    def __eq__(self, other: "EQWithLengthScale"):
        # If `other` is also an `EQWithLengthScale`, then this method checks whether
        # `self` and `other` can be treated as identical for the purpose of
        # algebraic simplifications. In this case, `self` and `other` are identical
        # for the purpose of algebraic simplification if `self.scale` and
        # `other.scale` are. We use `algebra.util.identical` to check this condition.
        return identical(self.scale, other.scale)


# It remains to implement pairwise and element-wise computation of the kernel.


@pairwise.dispatch
def pairwise(k: EQWithLengthScale, x: B.Numeric, y: B.Numeric):
    return Dense(k._compute(B.pw_dists2(x, y)))


@elwise.dispatch
def elwise(k: EQWithLengthScale, x: B.Numeric, y: B.Numeric):
    return k._compute(B.ew_dists2(x, y))

Example:

>>> from mlkernels import Linear, RQ

>>> k = EQWithLengthScale(2)

>>> k
EQWithLengthScale(2)

>>> k == EQWithLengthScale(2)
True

>>> 2 * k == k + EQWithLengthScale(2)
True

>>> k == Linear()
False

>>> k_composite = (2 * k + Linear()) * RQ(2.0)

>>> k_composite
(2 * EQWithLengthScale(2) + Linear()) * RQ(2.0)

>>> k_composite(np.linspace(0, 1, 3))
<dense matrix: shape=3x3, dtype=float64
 mat=[[2.    1.717 1.13 ]
      [1.717 2.25  2.16 ]
      [1.13  2.16  3.   ]]>

Of course, in practice we do not need to implement kernel variants that include a length scale, because we can always adjust the length scale by stretching a base kernel:

>>> EQ().stretch(2)(np.linspace(0, 1, 3))
<dense matrix: shape=3x3, dtype=float64
 mat=[[1.    0.969 0.882]
      [0.969 1.    0.969]
      [0.882 0.969 1.   ]]>

>>> EQWithLengthScale(2)(np.linspace(0, 1, 3))
<dense matrix: shape=3x3, dtype=float64
 mat=[[1.    0.969 0.882]
      [0.969 1.    0.969]
      [0.882 0.969 1.   ]]>