Artificial Neural Networks
Artificial Neural Networks (ANNs) are a key part of machine learning. We can see how they work by building a toy example.
This post shows two variations of ANNs in two languages: first in Python (with the NumPy library), then in J. These implementations are based on the code in this post.
Both of these languages are high-level, highly dynamic languages. Python's strengths include its extensive set of libraries, and the ease of extending the language both in terms of syntax and by binding to compiled C code. J's major strengths include its fundamental array-oriented paradigm, and the ease of composing its large set of primitive operations.
one-layer
One-layer networks can estimate linearly separable functions. However, they cannot estimate functions that are not linearly separable.
Linearly separable functions include most of the 2-input boolean functions (14 of the 16; only XOR and XNOR are excluded):
input | and | or | x → y |
---|---|---|---|
0 0 | 0 | 0 | 1 |
0 1 | 0 | 1 | 1 |
1 0 | 0 | 1 | 0 |
1 1 | 1 | 1 | 1 |
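
To make "linearly separable" concrete, here is a minimal sketch in plain Python (no NumPy, so every step is visible). It assumes a single unit that outputs 1 when its weighted sum is positive. The helper name `threshold_unit` and the weight triples are hand-picked for illustration and are not taken from the listings in this post. The third weight is a bias, which plays the same role as the constant 1 appended to each input row in those listings.

```python
from itertools import product

def threshold_unit(w_x, w_y, bias):
    """Outputs for inputs 00, 01, 10, 11: 1 when the weighted sum is positive."""
    return [int(w_x * x + w_y * y + bias > 0) for x, y in product((0, 1), repeat=2)]

# Hand-picked (w_x, w_y, bias) triples; many other choices work equally well.
hand_weights = {
    "and":    (1, 1, -1.5),   # fires only when both inputs are 1
    "or":     (1, 1, -0.5),   # fires when at least one input is 1
    "x -> y": (-1, 1, 0.5),   # implication: false only for x=1, y=0
}

for name, weights in hand_weights.items():
    print(name, threshold_unit(*weights))
# and    [0, 0, 0, 1]
# or     [0, 1, 1, 1]
# x -> y [1, 1, 0, 1]
```

Each function needs only one straight line through the input plane to separate its 0s from its 1s, which is why one weighted sum is enough.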
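
    import numpy as np

    # Each row is (x, y, 1); the constant 1 acts as a bias input.
    X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
    y = np.array([[0,1,1,1]]).T   # targets: the "or" column

    np.random.seed(1)
    w0 = 2*np.random.random((3,1)) - 1   # initial weights in [-1, 1)

    for i in range(10000):
        l1 = 1/(1+np.exp(-(X @ w0)))      # forward pass through the sigmoid
        l1_err = y - l1                   # how far off each prediction is
        l1_del = l1_err * l1 * (1 - l1)   # error scaled by the sigmoid's slope
        w0 += X.T @ l1_del                # adjust the weights

    print(l1)
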

    [[0.01627589]
     [0.98974255]
     [0.98974254]
     [0.99999822]]

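
The factor `l1 * (1 - l1)` in the Python listing above is the derivative of the sigmoid, written in terms of the sigmoid's own output. With that factor, `w0 += X.T @ l1_del` amounts to a gradient-descent step on half the summed squared error, with a learning rate of 1. The sketch below only checks the derivative identity numerically. The helper names `sigmoid` and `sigmoid_slope` are illustrative and do not appear in the listing.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_slope(z, h=1e-6):
    """Central finite-difference estimate of the sigmoid's derivative at z."""
    return (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

for z in (-2.0, 0.0, 0.5, 3.0):
    s = sigmoid(z)
    # The two printed values should agree to many decimal places.
    print(z, s * (1 - s), sigmoid_slope(z))
```

The slope is largest (0.25) at `z = 0` and shrinks toward 0 as the output approaches 0 or 1, so confident predictions receive smaller corrections.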

    input =: 4 3 $ 0 0 1 0 1 1 1 0 1 1 1 1
    target =: 4 1 $ 0 1 1 1
    dot =: +/ .*            NB. matrix product
    sig =: {{ %1+^-y }}     NB. sigmoid: 1 % 1 + e^-y
    train =: {{
      l1 =. sig input dot y
      l1_err =. target - l1
      l1_del =. l1_err * l1 * 1 - l1
      y + (|:input) dot l1_del
    }}
    NB. start from random weights in [-1, 1), train 10000 times, print the outputs
    5j3": sig input dot train^:10000 <:+:?.3 1$0

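
The listings above train only on the `or` column. As a rough sketch, the same update rule can be pointed at all three linearly separable functions from the table. This version is plain Python rather than NumPy, and it starts from zero weights instead of random ones. Both are assumptions made here to keep the example deterministic; `train_one_layer` and `predict` are hypothetical helper names.

```python
import math

# Inputs with a constant 1 appended as the bias term, as in the listings above.
X = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, row):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, row)))

def train_one_layer(targets, steps=10000):
    """Full-batch updates mirroring `w0 += X.T @ l1_del`; zero start is an assumption."""
    w = [0.0, 0.0, 0.0]
    for _ in range(steps):
        outs = [predict(w, row) for row in X]
        deltas = [(t - o) * o * (1 - o) for t, o in zip(targets, outs)]
        w = [wi + sum(d * row[i] for d, row in zip(deltas, X))
             for i, wi in enumerate(w)]
    return w

tables = {
    "and":    [x & y for x, y, _ in X],
    "or":     [x | y for x, y, _ in X],
    "x -> y": [int((not x) or y) for x, y, _ in X],
}

results = {}
for name, targets in tables.items():
    w = train_one_layer(targets)
    results[name] = [round(predict(w, row)) for row in X]
    print(name, results[name])
```

Rounding each trained output to 0 or 1 should reproduce the corresponding column of the table.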
two-layer
Two-layer networks are more versatile, and can estimate functions that are not linearly separable. The classic example of this is the boolean XOR function:
input | xor |
---|---|
0 0 | 0 |
0 1 | 1 |
1 0 | 1 |
1 1 | 0 |
A single neuron can't estimate this function, no matter how hard it tries. However, two layers in sequence (here, a hidden layer of four neurons feeding a single output neuron) handle it just fine.
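
The reason one unit fails is short. Suppose the unit outputs 1 when `w_x*x + w_y*y + b` is positive. XOR needs `b <= 0`, `w_y + b > 0`, `w_x + b > 0`, and `w_x + w_y + b <= 0`. Adding the two middle conditions gives `w_x + w_y + 2b > 0`, so `w_x + w_y + b > -b >= 0`, which contradicts the last condition. The sketch below illustrates this in plain Python. It searches a small grid of weights for a single threshold unit and finds none, then wires up a two-layer network by hand. The grid, the helper names, and the hand-picked weights (with only two hidden units rather than the four used in the listings) are all assumptions made for illustration.

```python
from itertools import product

INPUTS = list(product((0, 1), repeat=2))   # 00, 01, 10, 11
XOR = [0, 1, 1, 0]

def step(z):
    return 1 if z > 0 else 0

def single_unit(w_x, w_y, b):
    return [step(w_x * x + w_y * y + b) for x, y in INPUTS]

# Try every weight triple on a coarse grid from -5 to 5 in steps of 0.25.
grid = [k / 4 for k in range(-20, 21)]
single_unit_hits = [w for w in product(grid, repeat=3) if single_unit(*w) == XOR]
print(len(single_unit_hits))   # 0: no single unit on this grid computes XOR

def xor_two_layer(x, y):
    """Hand-wired network: XOR = (x OR y) AND NOT (x AND y)."""
    h_or = step(x + y - 0.5)            # hidden unit 1: OR
    h_nand = step(-x - y + 1.5)         # hidden unit 2: NAND
    return step(h_or + h_nand - 1.5)    # output unit: AND of the hidden units

print([xor_two_layer(x, y) for x, y in INPUTS])   # [0, 1, 1, 0]
```

Each hidden unit draws one line through the input plane, and the output unit combines the two half-planes. Training finds weights that play a similar role instead of having them picked by hand.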

    import numpy as np

    X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
    y = np.array([[0,1,1,0]]).T   # targets: xor

    np.random.seed(1)
    w0 = 2*np.random.random((3,4)) - 1   # input -> hidden (4 neurons)
    w1 = 2*np.random.random((4,1)) - 1   # hidden -> output

    for j in range(10000):
        # forward pass
        l1 = 1/(1+np.exp(-(X @ w0)))
        l2 = 1/(1+np.exp(-(l1 @ w1)))
        # backward pass: push the output error back through w1
        l2_error = y - l2
        l2_delta = l2_error * l2 * (1 - l2)
        l1_error = l2_delta @ w1.T
        l1_delta = l1_error * l1 * (1 - l1)
        # weight updates
        w0 += X.T @ l1_delta
        w1 += l1.T @ l2_delta

    print(np.round(l2,3))


    [[0.007]
     [0.991]
     [0.992]
     [0.01 ]]


    input =: 4 3 $ 0 0 1 0 1 1 1 0 1 1 1 1
    target =: 4 1 $ 0 1 1 0
    dot =: +/ .*
    sig =: {{ %>:^-y }}
    train =: {{
      NB. y is a boxed triple: previous output (unused), w0, w1
      'ignore_me w0 w1' =. y
      l1 =. sig input dot w0
      l2 =. sig l1 dot w1
      l2_error =. target - l2
      l2_delta =. l2_error * l2 * 1 - l2
      l1_error =. l2_delta dot |: w1
      l1_delta =. l1_error * l1 * 1 - l1
      w0 =. w0 + (|:input) dot l1_delta
      w1 =. w1 + (|:l1) dot l2_delta
      l2;w0;w1
    }}
    NB. train 10000 times from random weights, then print the first box (l2)
    5j3":0{::train^:10000 {{<:+:?.y$0}} each 1 ; 3 4 ; 4 1


    0.009
    0.990
    0.990
    0.011