Convolution

A Mac GUI for the deep-network convolution routines in AIToolbox

This is an interface to attempt to learn classification labels for images using the DeepNetwork class that has been added to the AIToolbox framework.

Training and testing use images that can be auto-generated, loaded from standard image files, or taken from the standard MNIST or cifar data sets.

Make sure you have the latest AIToolbox framework installed!

I need to add some more instructions on how to use it effectively. For now, look at the example experiments at the end of this document.

There are still a few missing options, but it can learn to recognize horizontal-line images, differentiate circles from lines, and even get almost 90% success on the MNIST classification problem in about a minute of training!

Concepts

DeepNetwork
Layer
Channel
Labelled Image
Batch
Epoch
Data Size
Loaded Image file

Network Operators

Convolution2D

Operation Definition

A Convolution2D operator requires a two-dimensional data array as input. The operator processes each pixel by multiplying the pixel value and its neighboring values by the entries of a small two-dimensional matrix and summing the results. For example, a 3x3 matrix convolution multiplies the target pixel value by the center value of the convolution matrix, adds the image value to the upper-left of the target pixel multiplied by the upper-left matrix value, adds the image value above the target pixel multiplied by the top-center matrix value, and so on. The resulting data is a two-dimensional array of the same size as the image data.
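
The per-pixel sum described above can be sketched in Swift. This is a minimal illustration, not AIToolbox's actual implementation; treating out-of-bounds neighbors as zero is an assumption, and the framework's edge handling may differ.

```swift
// Sketch of the described 2D convolution: each output pixel is the sum of
// the input pixel and its neighbors, each multiplied by the matching
// matrix entry. Out-of-bounds neighbors contribute 0 (assumed behavior).
func convolve2D(_ image: [[Double]], with matrix: [[Double]]) -> [[Double]] {
    let rows = image.count, cols = image[0].count
    let mRows = matrix.count, mCols = matrix[0].count
    let rOff = mRows / 2, cOff = mCols / 2
    var result = Array(repeating: Array(repeating: 0.0, count: cols), count: rows)
    for r in 0..<rows {
        for c in 0..<cols {
            var sum = 0.0
            for mr in 0..<mRows {
                for mc in 0..<mCols {
                    let ir = r + mr - rOff, ic = c + mc - cOff
                    if ir >= 0 && ir < rows && ic >= 0 && ic < cols {
                        sum += image[ir][ic] * matrix[mr][mc]
                    }
                }
            }
            result[r][c] = sum
        }
    }
    return result
}
```

Note that the output has the same dimensions as the input, matching the behavior described above.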

Network Operator Table Information

A Convolution2D operator appears in the Network Operator table with a type of "2D Convolution" and details giving the convolution matrix type and the values of the matrix in a single-dimensional array.

Definition/Editing Sheet

Convolution2D operators have a Convolution2D sheet for definition and editing. See the section for this sheet for more information.

Pooling

Operation Definition

A Pooling operator can process any data size. The operator 'pools' the data from a rectangular volume down to a single 'pixel'. The reduction size is specified as part of the operator, with each dimension of the input data being reduced by a specified amount. The data in each cell is aggregated using a selectable function: the average, minimum, or maximum of the data in the cell.
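
A two-dimensional version of this reduction can be sketched as follows. This is an assumed illustration of the behavior described above, not the framework's code; it reduces both dimensions by the same factor for brevity, whereas the operator allows a separate factor per dimension.

```swift
// Sketch of pooling: each output cell aggregates a factor-by-factor block
// of the input with a selectable function (average, minimum, or maximum).
enum PoolType { case average, minimum, maximum }

func pool2D(_ data: [[Double]], by factor: Int, using type: PoolType) -> [[Double]] {
    let outRows = data.count / factor, outCols = data[0].count / factor
    var result = Array(repeating: Array(repeating: 0.0, count: outCols), count: outRows)
    for r in 0..<outRows {
        for c in 0..<outCols {
            // Gather the values of this output cell's input block
            var cell: [Double] = []
            for dr in 0..<factor {
                for dc in 0..<factor {
                    cell.append(data[r * factor + dr][c * factor + dc])
                }
            }
            switch type {
            case .average: result[r][c] = cell.reduce(0, +) / Double(cell.count)
            case .minimum: result[r][c] = cell.min()!
            case .maximum: result[r][c] = cell.max()!
            }
        }
    }
    return result
}
```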

Network Operator Table Information

A Pooling operator appears in the Network Operator table with a type of "Pooling" and details giving the pooling type (average, minimum, or maximum), and the reduction factor for each dimension.

Definition/Editing Sheet

Pooling operators have a Pooling sheet for definition and editing. See the section for this sheet for more information.

FeedForwardNN

Operation Definition

A Feed-Forward Neural Network operator can process any data size, but treats it as a single linear vector. The operator multiplies each input value by a learnable weight value and sums the results. This summation is then processed through a selectable activation function, effectively 'squashing' the summation to a manageable value range. This is done for a definable number of neural network nodes in the operator. The nodes can be specified to be treated as a set of a specified dimension and size. The output is presented as a vector, array, volume, or 4-dimensional solid with dimensions matching the node sizing for the operator.
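
The per-node computation can be sketched as below. This is a minimal, assumed illustration of the description above, not AIToolbox's DeepNetwork code; the weight layout (one row per node plus a bias) is an assumption.

```swift
import Foundation

// Sketch of one feed-forward layer: each node forms a weighted sum of the
// entire input vector plus its bias, then squashes it with an activation
// function such as tanh.
func feedForwardLayer(input: [Double],
                      weights: [[Double]],   // one weight row per node
                      biases: [Double],
                      activation: (Double) -> Double) -> [Double] {
    var output: [Double] = []
    for (node, nodeWeights) in weights.enumerated() {
        var sum = biases[node]
        for (i, w) in nodeWeights.enumerated() {
            sum += w * input[i]
        }
        output.append(activation(sum))
    }
    return output
}
```

The output vector has one entry per node, which the operator can then reshape to the specified dimensions.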

Network Operator Table Information

A Neural Network operator appears in the Network Operator table with a type of "FeedForward NN" and details giving the activation function for the network and the resulting data size (based on the number of nodes in the network).

Definition/Editing Sheet

FeedForward Neural Network operators have a Neural Network sheet for definition and editing. See the section for this sheet for more information.

NonLinearity

Operation Definition

A NonLinearity operator can process any data size, but treats it as a single linear vector. The operator performs the selected non-linearity activation function on each element of the input, resulting in an output of the same size.
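
In other words, the operator is an element-wise map over the input. A minimal sketch, with a rectified-linear activation as an assumed example (the sheet's actual list of selectable functions may differ):

```swift
// Sketch of the NonLinearity operator: apply the chosen activation to every
// element, producing an output of the same size as the input.
func rectifiedLinear(_ x: Double) -> Double { max(0.0, x) }

func applyNonLinearity(_ input: [Double], _ activation: (Double) -> Double) -> [Double] {
    return input.map(activation)
}
```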

Network Operator Table Information

A NonLinearity operator appears in the Network Operator table with a type of "NonLinearity" and details giving the activation function for the operation.

Definition/Editing Sheet

NonLinearity operators use the Neural Network sheet for definition and editing, with the result size disabled, leaving just the activation function. See the section for this sheet for more information.

Menu commands

Convolution->Quit
File->Open
File->Save
Network->Initialize
Network->Gradient Check

Main Window

Scale Image to Size

Training Image source

Generated/Loaded Images Radio Buttons
Loaded Image path selection
Configure Image Generator

Training

Train Button
Repeat Forever Checkbox
Batch Size Entry
Number of Epochs Entry
Training Rate Entry
Weight Decay Entry

Inputs

Input List
Add Button
Delete Button

Layers

Layer List
Add Button
Delete Button

Channels

Channel List
Add Button
Delete Button

Network Operators

Operator List
Add Button
Operator Type Selection
Edit Button
Delete Button

Neural Network Output

Output List
Resulting Class

Topology Error

Data Image

Image Source

Input Sheet

The input sheet is used to add or modify an input set definition. The following sections describe the entries in the sheet.

Identifier

Data Type

Channel Sheet

The channel sheet is used to add or modify a channel definition. The following sections describe the entries in the sheet.

Channel Identifier

Input Source ID

2D Convolution Sheet

The 2D convolution sheet is used to add or modify a convolution operation. The following sections describe the entries in the sheet.

2D Convolution Type

Convolution Matrix

2D Convolution Matrix Types

  1. Vertical Edge 3x3 - A vertical-gradient Sobel-type filter
  2. Horizontal Edge 3x3 - A horizontal-gradient Sobel-type filter
  3. Custom 3x3 - A user-supplied 3x3 matrix
  4. Learnable 3x3 - A 3x3 matrix whose values will be learned from the error gradient

Experiments

  1. Horizontal lines - learns to discriminate between horizontal and vertical lines
    • File->Open the Test1_Horizontal file. This is a simple network with one channel. A convolution of horizontal gradient, pooling down to 16 squares using maximum, and a two-layer neural net with 4 and 1 nodes respectively
    • Select the HorizontalTest file from the HorizontalTest directory for testing (your path will differ from mine)
    • Leave repeat training and Auto testing on.
    • Click 'Train'
    • The network should fairly quickly learn the horizontal lines, getting a 100% test rate in under a minute. If you continue, the training error will continue down, while the test error goes up! A classic case of 'overfitting'!
  2. Circles and Lines - learns to discriminate between horizontal lines and circles using a trainable convolution operator
    • File->Open the Test2_Circle file. This is a simple network with one channel. A trainable convolution operator, pooling down to 16 squares using maximum, and a two-layer neural net with 4 and 1 nodes respectively.
    • This example uses generated images, so no paths need to be set
    • Leave repeat training on.
    • For speedier training, turn off 'Auto test'
    • Click 'Train'
    • When the training error starts to go down (gets below 40 sometimes) turn auto testing back on to see the test percentage start to creep up into the 90s. Since images are all generated on the fly, there are a near-infinite number of variations, so we will likely never get a perfect score.
    • The network should learn the differences moderately quickly, reaching about a 95% test rate in about 5 minutes.
  3. Simple MNIST - learns to find digits in the MNIST data set with a single neural network layer
    • Download and extract the MNIST data set to a folder. The data can be found here.
    • Copy the MNIST_train and MNIST_test files from the TestFiles directory into the directory with the downloaded MNIST files. These are the plist files that define the images to be used. You may need to edit them if the data file names have changed.
    • File->Open the Test3_MNIST file. This is a simple network with one channel. There is only one 10-node layer with full connection to the 32x32 inputs.
    • The file paths saved for the experiment will not match your directory structure. Click on the file path for training and select the MNIST_train file you moved earlier. Click on the Select button for testing and select the MNIST_test file
    • Leave repeat training on.
    • Make sure 'Auto test' is off
    • Click 'Test'. You will likely get about 6.7% accuracy before training. Just guessing would get 10%.
    • Click 'Train'.
    • After about a minute or two, click Stop to stop training.
    • Click 'Test' again. I have been getting just under 90% on my tests. It is unlikely that more training will improve it much. A much more complicated network is needed to learn subtleties in the data.
  4. cifar Files - plist files are provided for the cifar data set
    • Download and extract the cifar data set to a folder. The data can be found here.
    • Copy the cifar_train and cifar_test files from the TestFiles directory into the directory with the downloaded cifar files. These are the plist files that define the image files to be used. You may need to edit them if the data file names have changed.

License

This program is made available under the Apache license.