{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "In the [previous post](http://andrew.gibiansky.com/blog/machine-learning/nram-1/), we discussed a simplified neural random-access memory machine (NRAM). We dubbed this simplified machine a neural *register* machine, since it lacks random-access memory but has a fixed set of registers to store data. In this post, we'll implement the register machine using [Theano](http://deeplearning.net/software/theano/), and then train it on a very simple example. We'll then demonstrate how we could extend this to the full NRAM model, showing how to implement the READ and WRITE gates.\n", "\n", "This post is meant as a code-heavy demonstration of how to implement a model like the one under discussion, and all the subtleties that have to be dealt with. You will not get much out of this post unless you are willing to read the code carefully and scrutinize it until you are sure you understand it!\n", "\n", "This post is a runnable IPython notebook, which you can download [here](data/post.ipynb). While reading it, I recommend you refer to the [previous post](http://andrew.gibiansky.com/blog/machine-learning/nram-1/) for the theory being implemented. The code for the full NRAM model is available [on Github](https://github.com/gibiansky/experiments/tree/master/nram)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll start by importing the the necessary libraries, namely Theano and Numpy:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import theano\n", "from theano import tensor\n", "import numpy as np\n", "\n", "from collections import namedtuple" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Implementing Gates" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, let's define a few gates. Recall from our previous posts that gates can be constants, single-argument functions, or two-argument functions, and operate on probability distributions over the integers 0 to $M - 1$.\n", "\n", "We'll represent gates via a `Gate` `namedtuple` with an `arity` field (0 for constants, 1 for single argument functions, etc) and a `module` field (which stores the actual function the gate applies):" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "Gate = namedtuple(\"Gate\", \"arity module\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Constant Gates\n", "The simplest gates to define are constants. The probability distribution for a constant is a [one-hot encoding](https://en.wikipedia.org/wiki/One-hot) of the value; for example, the integer one is converted to a vector in which the 2nd element (index 1) is one, and all others are zero. This vector represents the probability distribution which, when sampled, returns one always.\n", "\n", "Let's define a few useful constant gates:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Theano's to_one_hot converts a numpy array\n", "# to a one-hot encoded version. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "#### Constant Gates\n", "The simplest gates to define are constants. The probability distribution for a constant is a [one-hot encoding](https://en.wikipedia.org/wiki/One-hot) of its value; for example, the integer one is encoded as a vector in which the second element (index 1) is one and all others are zero. This vector represents the probability distribution which, when sampled, always returns one.\n", "\n", "Let's define a few useful constant gates:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Theano's to_one_hot converts a numpy array\n", "# to a one-hot encoded version. In order to\n", "# know how big to make the vector, all gates\n", "# take M, the smallest non-representable integer,\n", "# as an argument.\n", "from theano.tensor.extra_ops import to_one_hot\n", "\n", "def make_constant_gate(value):\n", "    \"\"\"Create a gate that returns a constant distribution.\"\"\"\n", "    # Arguments to to_one_hot must be NumPy arrays.\n", "    arr = np.asarray([value])\n", "\n", "    def module(max_int):\n", "        \"\"\"Return the one-hot encoded constant.\"\"\"\n", "        return to_one_hot(arr, max_int)\n", "\n", "    arity = 0\n", "    return Gate(arity, module)\n", "\n", "# Gates for 0, 1, 2: useful constants!\n", "gate_zero = make_constant_gate(0)\n", "gate_one = make_constant_gate(1)\n", "gate_two = make_constant_gate(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
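As a quick sanity check, we can evaluate the symbolic output of a constant gate's module; for $M = 5$, the module of `gate_one` should produce a one-hot row vector with a one at index 1. The sketch below uses `eval()`, which works here because the expression has no free symbolic inputs:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Sanity check: compile and run the symbolic one-hot output\n", "# of gate_one's module with M = 5. Since the expression has\n", "# no free inputs, eval() can compute it directly.\n", "print(gate_one.module(5).eval())  # expect [[ 0.  1.  0.  0.  0.]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "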