Actually, let's make a table of contents:

- Neural Nets in Rust: Starts with this post.
- Nobilis 3e Character Creation: Starts with this post.

At the same time as I do other things, I also work on various hobby coding projects. One that I've recently started focusing on again is an attempt to roll my own neural net library: the libraries I've looked at confuse me, so maybe I'll get some insight if I make one myself.

*The story so far...*

Several months ago, I read up on basic machine learning stuff, and put together a primitive autoencoder in Python. The design had some interesting ideas, which I'll try to carry forward, and some dodgy implementation details, which I'll try not to.

Crash course on the design: I ended up foundering in my attempts to understand tensor concepts. In order to have some concept of what nets are doing, I started translating the vector behaviors in the writeups into very localized descriptions of what happens to single units in the network, then worked back up from there.

The architecture that I ended up going for was a symbolic DAG (directed acyclic graph) of simple mathematical operations. To explain what this was like, consider an algebraic expression, like (5x + 3y)(56x-1)^5. That expression can be thought of as a tree structure:

```
*
├── +
│   ├── *
│   │   ├── 5
│   │   └── x
│   └── *
│       ├── 3
│       └── y
└── ^
    ├── -
    │   ├── *
    │   │   ├── 56
    │   │   └── x
    │   └── 1
    └── 5
```
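To make the tree concrete, here's a quick sketch of the same expression as nested Python tuples, with a little recursive evaluator. This encoding is just mine, for illustration; the library's actual Node representation is different.

```python
# The example expression (5x + 3y)(56x - 1)^5 as a nested tuple tree.
expr = ('*',
        ('+', ('*', 5, 'x'), ('*', 3, 'y')),
        ('^', ('-', ('*', 56, 'x'), 1), 5))

def evaluate(node, bindings):
    """Recursively evaluate a tuple tree, looking up variables in bindings."""
    if isinstance(node, (int, float)):
        return node                      # constant leaf
    if isinstance(node, str):
        return bindings[node]            # variable leaf
    op, args = node[0], node[1:]
    vals = [evaluate(a, bindings) for a in args]
    if op == '+': return vals[0] + vals[1]
    if op == '-': return vals[0] - vals[1]
    if op == '*': return vals[0] * vals[1]
    if op == '^': return vals[0] ** vals[1]
    raise ValueError('unknown op: %r' % (op,))

evaluate(expr, {'x': 1, 'y': 1})  # (5 + 3) * 55**5 = 4026275000
```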

Differentiating trees like this raises two problems: the product rule keeps duplicating subtrees, so a naive tree representation blows up, and you need some way to recognize that two separately-built expressions are actually the same. My solution to the latter problem was to develop a canonical form for the nodes. Equivalent node values will draw their arguments into a common form, making it trivial to confirm that two nodes are identical. This ability then feeds into solving the first problem: instead of a tree, the nodes are in a DAG, where any particular expression is only instantiated once. I'll get into how all this works in detail after I've posted the code.
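As a rough sketch of the idea (my reconstruction, not the actual library code): intern every node in a dict keyed on its canonical form, sorting the arguments of commutative operations, so equivalent expressions come back as the very same object.

```python
class Environment:
    """Hash-consing sketch: each distinct expression is instantiated once.

    This is a toy reconstruction of the idea, not the library's real code.
    """
    def __init__(self):
        self._interned = {}

    def node(self, op, *args):
        if op in ('+', '*'):
            # canonical form for commutative ops: fix the argument order
            args = tuple(sorted(args, key=repr))
        key = (op,) + tuple(args)
        # return the already-interned node if we've built this one before
        return self._interned.setdefault(key, key)

env = Environment()
a = env.node('+', 'x', 'y')
b = env.node('+', 'y', 'x')
# equivalent expressions are literally the same object, so the
# expression "tree" collapses into a DAG
a is b  # True
```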

First up, the concepts the library can express: addition, multiplication, hyperbolic tangent, inputs to the network, trainable values, and named constants. Each of these can be differentiated with respect to a trainable value, and the result expressed simply, solely in terms of the concepts that are already defined. Addition and multiplication work as in calculus, trainable values are 1 with respect to themselves, and 0 with respect to everything else, and similarly for the inputs and constants. The derivative of tanh(x) with respect to x is just 1-tanh^2(x), so that's no big deal either.
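Those rules are simple enough to sketch directly. Here's my own toy version over nested tuples (not the library's Node classes); note that every derivative it produces is built only from addition, multiplication, tanh, and constants, exactly as described above.

```python
def diff(expr, wrt):
    """Differentiate a nested-tuple expression with respect to the name wrt.

    Toy illustration of the rules described above; the real library works
    on interned Node objects rather than raw tuples.
    """
    if isinstance(expr, (int, float)):
        return 0                                # constants
    if isinstance(expr, str):
        return 1 if expr == wrt else 0          # trainable values / inputs
    op = expr[0]
    if op == '+':                               # sum rule
        return ('+', diff(expr[1], wrt), diff(expr[2], wrt))
    if op == '*':                               # product rule: a'b + ab'
        return ('+',
                ('*', diff(expr[1], wrt), expr[2]),
                ('*', expr[1], diff(expr[2], wrt)))
    if op == 'tanh':                            # 1 - tanh^2, via + and *
        t = ('tanh', expr[1])
        return ('*',
                ('+', 1, ('*', -1, ('*', t, t))),
                diff(expr[1], wrt))
    raise ValueError('unknown op: %r' % (op,))

diff(('*', 'x', 'y'), 'x')  # ('+', ('*', 1, 'y'), ('*', 'x', 0))
```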

With all that said, here's the code I'm working with now: http://pastebin.com/B5tx7ekx. This is actually a rewrite of the first, more confusing iteration, which tried to do performance hacks in PyPy. Now, you'll notice that this code is in Python, not Rust. So what gives? Well, the code is already fairly strictly typed, and I'd like to see how it does in a language that can actually take advantage of that, and also to get some hands-on experience with Rust. (You'll also notice that there's no code to actually evaluate a DAG on an input. That got stripped out in the rewrite, for the time being.)

For now, a quick tour of the code:

`class Environment:`

The methods contain some asserts. Some of those are useful, and some are left over from chasing down some absolutely shameful bugs introduced in the rewrite.

```
def add(self, left, right):
    ...
def multiply(self, left, right):
    ...
def _combine_terms(self, left, right):
    ...
```
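I haven't walked through these yet, but as a guess at the kind of thing a `_combine_terms`-style helper is for (the behavior and representation here are pure speculation on my part, not the actual code): folding together the coefficients of identical terms when two sums are combined.

```python
from collections import Counter

# Speculative sketch, not the library's code: sums are sequences of
# (coefficient, term) pairs, and combining two sums merges the
# coefficients of equal terms, so 3x + y plus 5x folds into y + 8x.
def combine_terms(left, right):
    totals = Counter()
    for coeff, term in tuple(left) + tuple(right):
        totals[term] += coeff
    # drop terms whose coefficients cancelled, and sort into canonical order
    return tuple(sorted((c, t) for t, c in totals.items() if c != 0))

combine_terms(((3, 'x'), (1, 'y')), ((5, 'x'),))  # ((1, 'y'), (8, 'x'))
```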

`def cmp(left, right):`

`class State(collections.namedtuple('State', ['input', 'variables', 'parameters'])):`

`class Node(tuple):`

Then, a bunch of instantiations of the abstract class. In Rust, this is going to be accomplished with several traits and enums.
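To illustrate the pattern (a sketch with invented subclass names, not the library's real ones): subclassing tuple gives every node structural equality and hashing for free, provided the operation itself is part of the tuple's contents, since plain tuple equality ignores the class.

```python
class Node(tuple):
    """Base class; subclassing tuple gives structural equality and hashing."""

class Sum(Node):
    def __new__(cls, *terms):
        # tag the tuple with the class name so different operations
        # never compare equal just because their arguments match
        return super(Sum, cls).__new__(cls, (cls.__name__,) + terms)

class Product(Node):
    def __new__(cls, *factors):
        return super(Product, cls).__new__(cls, (cls.__name__,) + factors)

Sum('x', 'y') == Sum('x', 'y')      # True: structural equality
Sum('x', 'y') == Product('x', 'y')  # False: the tags differ
```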

In any case, this is all we need to describe the *structure* of a neural network, including determining the gradient of a loss function, in order to perform gradient descent. Actually evaluating it is another matter, and I'd like to translate this code to Rust first, before trying to re-implement that.

So, where I stand now: I've got a bunch of Python code that passes its tests (not shown), and that, aside from the complete lack of documentation and comments, I think looks pretty good. Next week I'll write about trying to port it to Rust. That's all for now. See you around!