Creating a neural network framework with numpy
- 2021-11-24 01:54:42
- OfStack
Introduction
This article describes the usage and design ideas of a neural network framework built with numpy.
The framework is hand-written in imitation of pytorch, as a way to learn the basic algorithms of neural networks: forward propagation, back propagation, the various layers, and the various activation functions.
The code is written in an object-oriented style, so the ideas are clear.
Students who want to learn about neural networks can use it as a reference.
The overall structure of the code is clear, although admittedly there are some ugly parts and places where the imitation of pytorch falls short.
Project introduction
MINST_recognition:
Handwritten digit recognition, using the MNIST dataset
Accuracy reaches 93% after 30 epochs of training; after about 500 epochs of training the accuracy stops improving
RNN_sin_to_cos: uses a recurrent neural network (RNN) to predict the cos curve from the sin curve
At present it still has a bug and cannot be trained normally
Introduction to the Framework
The code related to the framework is placed in the mtorch folder
Usage
As with pytorch, you need to define your own neural network, loss function, gradient descent optimization algorithm, and so on.
In each round of training, you first obtain the sample input and feed it into your network to get the output. The predicted results and the expected results are then passed to the loss function to compute the loss, the gradients are computed from the loss, and finally the optimizer updates the parameters of the network.
This is easier to understand together with the code:
The following is the main code for handwritten digit recognition on the MNIST dataset
# Define the neural network
class DigitModule(Module):
    def __init__(self):
        # The layers are computed in the order they are defined here
        sequential = Sequential([
            layers.Linear2(in_dim=ROW_NUM * COLUM_NUM, out_dim=16, coe=2),
            layers.Relu(16),
            layers.Linear2(in_dim=16, out_dim=16, coe=2),
            layers.Relu(16),
            layers.Linear2(in_dim=16, out_dim=CLASS_NUM, coe=1),
            layers.Sigmoid(CLASS_NUM)
        ])
        super(DigitModule, self).__init__(sequential)

module = DigitModule()  # create the model
loss_func = SquareLoss(backward_func=module.backward)  # define the loss function
optimizer = SGD(module, lr=learning_rate)  # define the optimizer

for i in range(EPOCH_NUM):  # train for EPOCH_NUM epochs in total
    trainning_loss = 0  # accumulate the loss of the current epoch (optional)
    for data in train_loader:  # iterate over all samples; train_loader is an iterable holding all the data in the dataset
        imgs, targets = data  # split the data into images and labels
        outputs = module(imgs)  # feed the sample inputs into the network
        loss = loss_func(outputs, targets, transform=True)  # calculate the loss
        trainning_loss += loss.value
        loss.backward()  # calculate gradients through back propagation
        optimizer.step()  # adjust the weights of the network through the optimizer
    if i % TEST_STEP == 0:  # every TEST_STEP epochs, test the results of the current training
        show_effect(i, module, loss_func, test_loader, i // TEST_STEP)
        print("{} turn finished, loss of train set = {}".format(i, trainning_loss))
Next I will introduce the classes one by one. They are modeled after pytorch and have the same names and roles as their pytorch counterparts:
Module Class
Unlike pytorch, there can only be one Sequential (sequence); every layer of the network and their order are defined in it, and it is then passed to the constructor of the Module class
Forward propagation: calls the Sequential's forward propagation
Back propagation: calls the Sequential's back propagation
So far most of the functionality of this class is the same as Sequential's; it is only a wrapper that keeps the interface consistent with pytorch
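To make the relationship between Module and Sequential concrete, here is a minimal sketch of what such a wrapper could look like. The method names follow the calls used elsewhere in this article (module(imgs), module.backward, Module.step()); the actual implementation in mtorch may differ in its details:

class Module:
    """Thin wrapper around a Sequential that keeps the interface similar to pytorch."""
    def __init__(self, sequential):
        self.sequential = sequential

    def __call__(self, x):
        # allows the module(imgs) call syntax used in the training loop
        return self.forward(x)

    def forward(self, x):
        # forward propagation simply delegates to the Sequential
        return self.sequential.forward(x)

    def backward(self, output_gradient):
        # back propagation simply delegates to the Sequential
        return self.sequential.backward(output_gradient)

    def step(self, lr):
        # parameter update delegates to the Sequential, which updates every layer
        self.sequential.step(lr)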
Loss Function Classes
There are several different loss functions; the constructor must be given the back propagation function of the network you defined yourself
Calling the loss function returns an object of the Loss class, which records the loss value.
Back propagation to compute the gradients is done by calling the .backward() method of the Loss object
Internal mechanism:
Internally it simply calls the back propagation function of the network you defined
This is also a poor imitation of pytorch and is not really necessary; calling it directly through Module would work just as well
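As an illustration of this design, here is a minimal sketch of a squared-error loss that returns a Loss object, following the calls used in the training loop above (SquareLoss(backward_func=module.backward), loss.value, loss.backward()). The handling of the transform flag (turning integer labels into one-hot vectors) and the exact gradient scaling are my assumptions, not necessarily what mtorch does:

import numpy as np

class Loss:
    """Records the loss value and knows how to start back propagation."""
    def __init__(self, value, gradient, backward_func):
        self.value = value                  # scalar loss value
        self.gradient = gradient            # gradient of the loss w.r.t. the network output
        self.backward_func = backward_func  # back propagation function of the user-defined network

    def backward(self):
        # internally this just calls the network's own back propagation function
        self.backward_func(self.gradient)

class SquareLoss:
    """Mean squared error loss."""
    def __init__(self, backward_func):
        self.backward_func = backward_func

    def __call__(self, outputs, targets, transform=False):
        if transform:
            # assumption: transform=True converts integer class labels into one-hot vectors
            targets = np.eye(outputs.shape[-1])[targets]
        value = np.mean(np.sum((outputs - targets) ** 2, axis=-1))
        gradient = 2 * (outputs - targets) / outputs.shape[0]
        return Loss(value, gradient, self.backward_func)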
Optimizer Class
At present only stochastic gradient descent (SGD) is implemented
The constructor argument is your own Module. After the gradients have been computed, calling optimizer.step() updates the parameter values of every layer inside the Module
Internal mechanism:
Since SGD is the only algorithm so far, it is only a rough imitation for the time being
That is, it simply calls Module.step(), then Module calls Sequential.step(), and finally Sequential calls the step() of each internal Layer to perform the update
The gradient values are computed during loss.backward() and stored in each layer
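Under this design the optimizer itself is very small. A minimal sketch, assuming the interface used above (SGD(module, lr=learning_rate) and optimizer.step() with no arguments):

class SGD:
    """Stochastic gradient descent: the actual update is delegated to the Module."""
    def __init__(self, module, lr):
        self.module = module
        self.lr = lr

    def step(self):
        # the gradients were already computed and stored in each layer by loss.backward();
        # Module.step() forwards to Sequential.step(), which calls step() on every Layer
        self.module.step(self.lr)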
Layer Classes
There are many different layers
What they have in common (a minimal example is sketched below):
Forward propagation:
Takes one input, performs the forward computation, and produces one output
The input is saved and used later during back propagation
Back propagation:
Takes the gradient of the forward-propagation output, computes the gradients of its own parameters (such as w and b in Linear) and saves them
Its return value is the gradient with respect to the forward-propagation input, so that the previous layer (if there is one) can continue the back propagation
In this way, different layers can be assembled in any order without breaking forward or back propagation
The step method:
Updates its own parameter values (or does nothing, as in activation and pooling layers)
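As a concrete example of this contract, here is a minimal sketch of a fully connected layer. The gradients dictionary with keys "w" and "b" matches the attributes used by the RNN code later in this article; the initialization and the exact update rule are simplified assumptions:

import numpy as np

class Linear:
    """Fully connected layer: y = x @ w + b."""
    def __init__(self, in_dim, out_dim):
        self.w = np.random.randn(in_dim, out_dim) * np.sqrt(1.0 / in_dim)
        self.b = np.zeros(out_dim)
        self.gradients = {"w": np.zeros_like(self.w), "b": np.zeros_like(self.b)}

    def __call__(self, x):
        return self.forward(x)

    def forward(self, x):
        self.inputs = x  # save the input for back propagation
        return x @ self.w + self.b

    def backward(self, output_gradient):
        # gradients of this layer's own parameters, stored until the optimizer calls step()
        self.gradients["w"] = self.inputs.T @ output_gradient
        self.gradients["b"] = np.sum(output_gradient, axis=0)
        # gradient with respect to the input, returned so the previous layer can keep propagating
        return output_gradient @ self.w.T

    def step(self, lr):
        # plain gradient descent update of the parameters
        self.w -= lr * self.gradients["w"]
        self.b -= lr * self.gradients["b"]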
Sequential Class
This class also inherits from Layer and can itself be used as a Layer
It assembles multiple layers into one in order, and runs through them in sequence during forward and back propagation
This is easiest to understand from its forward and backward methods:
def forward(self, x):
    out = x
    for layer in self.layers:
        out = layer(out)
    return out

def backward(self, output_gradiant):
    layer_num = len(self.layers)
    delta = output_gradiant
    for i in range(layer_num - 1, -1, -1):
        # traverse the layers in reverse order, propagating the gradient backwards
        delta = self.layers[i].backward(delta)

def step(self, lr):
    for layer in self.layers:
        layer.step(lr)
RNN Class: Recurrent Neural Network Layer
It inherits from Layer; it is described separately here because its content is more complicated
The RNN is composed of a fully connected layer (Linear) and an activation layer
Forward propagation
def forward(self, inputs):
    """
    :param inputs: inputs = (h0, x), h0.shape == (batch, out_dim), x.shape == (seq, batch, in_dim)
    :return: outputs: outputs.shape == (seq, batch, out_dim)
    """
    h = inputs[0]  # the input consists of two parts: the initial hidden state and the sequence
    X = inputs[1]
    if X.shape[2] != self.in_dim or h.shape[1] != self.out_dim:
        # check whether the input shapes are valid
        raise ShapeNotMatchException(self, "forward: wrong shape: h0 = {}, X = {}".format(h.shape, X.shape))
    self.seq_len = X.shape[0]  # length of the time series
    self.inputs = X  # save the input for use during back propagation
    output_list = []  # save the output at each time step
    for x in X:
        # iterate over the time steps of the input
        # x.shape == (batch, in_dim), h.shape == (batch, out_dim)
        h = self.activation(self.linear(np.c_[h, x]))
        output_list.append(h)
    self.outputs = np.stack(output_list, axis=0)  # convert the list into one matrix and save it
    return self.outputs
Back propagation
def backward(self, output_gradiant):
    """
    :param output_gradiant: shape == (seq, batch, out_dim)
    :return: input_gradiant
    """
    if output_gradiant.shape != self.outputs.shape:
        # expect a gradient of shape (seq, batch, out_dim)
        raise ShapeNotMatchException(self, "__backward: expected {}, but we got "
                                           "{}".format(self.outputs.shape, output_gradiant.shape))
    input_gradients = []
    # each time step contributes a weight_gradient; they are accumulated into the total weight_gradient
    weight_gradients = np.zeros(self.linear.weights_shape())
    bias_gradients = np.zeros(self.linear.bias_shape())
    batch_size = output_gradiant.shape[1]
    # total_gradient: during forward propagation x and h are concatenated into one large matrix,
    # so back propagation computes the gradient of that large matrix and then splits it into x_grad and h_grad
    total_gradient = np.zeros((batch_size, self.out_dim + self.in_dim))
    h_gradient = None
    # traverse the time steps in reverse, computing the gradient at each one
    for i in range(self.seq_len - 1, -1, -1):
        # forward propagation order: x, h -> z -> h
        # so the back propagation order is: h_grad -> z_grad -> x_grad, h_grad, w_grad, b_grad
        # %%%%%%%%%%%%%% averaged version %%%%%%%%%%%%%%%%%%%%%%%
        # h_gradient = (output_gradiant[i] + total_gradient[:, 0:self.out_dim]) / 2
        # %%%%%%%%%%%%%% non-averaged version %%%%%%%%%%%%%%%%%%%%%%%
        # h_grad at this time step has two parts: the gradient of this step's output
        # and the gradient coming back from the following time step
        h_gradient = output_gradiant[i] + total_gradient[:, 0:self.out_dim]
        # w_grad and b_grad are computed inside linear.backward(), so there is no need to compute them manually
        z_gradient = self.activation.backward(h_gradient)  # compute z_grad
        total_gradient = self.linear.backward(z_gradient)  # compute the gradient of the concatenated [h, x] matrix
        # total_gradient contains the gradients of both h and x, shape == (batch, out_dim + in_dim)
        x_gradient = total_gradient[:, self.out_dim:]
        input_gradients.append(x_gradient)
        weight_gradients += self.linear.gradients["w"]
        bias_gradients += self.linear.gradients["b"]
    # %%%%%%%%%%%%%%%%%% averaged version %%%%%%%%%%%%%%%%%%%%%%%
    # self.linear.set_gradients(w=weight_gradients / self.seq_len, b=bias_gradients / self.seq_len)
    # %%%%%%%%%%%%%%%%%% non-averaged version %%%%%%%%%%%%%%%%%%%%%%%
    self.linear.set_gradients(w=weight_gradients, b=bias_gradients)  # set the gradient values
    list.reverse(input_gradients)  # input_gradients was built in reverse order, so reverse it for the final output
    print("sum(weight_gradients) = {}".format(np.sum(weight_gradients)))
    # np.stack converts the list into one matrix
    return np.stack(input_gradients), h_gradient
The above covers the details of creating a neural network framework with numpy. For more information about numpy neural networks, please follow the other related articles on this site!