Implementing Stochastic Gradient Descent (SGD) in Python

  • 2020-06-15 09:40:34
  • OfStack

This post implements the stochastic gradient descent algorithm for training a neural network on samples. Based on the explanation by Peng Liang, a teacher at Maizi University, I summarize it as follows (the structure of the neural network is defined in another blog post):


def SGD(self, training_data, epochs, mini_batch_size, eta, test_data=None):
    if test_data:
        n_test = len(test_data)  # number of test examples
    n = len(training_data)
    for j in range(epochs):
        # Shuffle, then slice the training set into consecutive mini-batches.
        random.shuffle(training_data)
        mini_batches = [
            training_data[k:k + mini_batch_size]
            for k in range(0, n, mini_batch_size)]
        for mini_batch in mini_batches:
            self.update_mini_batch(mini_batch, eta)
        if test_data:
            print("Epoch {0}: {1}/{2}".format(j, self.evaluate(test_data), n_test))
        else:
            print("Epoch {0} complete".format(j))

  • training_data: the training set, a list of tuples; each tuple (x, y) represents one instance, where x is the vector representation of an image and y is its category label.
  • epochs: the number of training rounds.
  • mini_batch_size: the number of instances in each mini-batch.
  • eta: the learning rate.
  • test_data: the test set (optional; used only to report accuracy after each epoch).
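The shuffle-and-slice step inside SGD can be illustrated in isolation. The sketch below is a standalone helper (not part of the Network class; the name make_mini_batches is my own) that partitions a toy dataset exactly as the list comprehension above does:

```python
import random

def make_mini_batches(training_data, mini_batch_size, seed=None):
    """Shuffle the data and slice it into consecutive mini-batches,
    mirroring the list comprehension inside SGD."""
    data = list(training_data)
    rng = random.Random(seed)
    rng.shuffle(data)
    n = len(data)
    return [data[k:k + mini_batch_size] for k in range(0, n, mini_batch_size)]

# Toy dataset of 10 (x, y) tuples; batch size 3 leaves a smaller final batch.
toy = [(i, i % 2) for i in range(10)]
batches = make_mini_batches(toy, 3, seed=0)
print([len(b) for b in batches])  # → [3, 3, 3, 1]
```

Note that when mini_batch_size does not divide the dataset size, the last batch is simply shorter; dividing by len(mini_batch) in the update (rather than by the nominal batch size) keeps the gradient average correct in that case.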
The key function is self.update_mini_batch, which updates the weights and biases; it is defined next.


def update_mini_batch(self, mini_batch, eta):
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    for x, y in mini_batch:
        # Partial derivatives of the cost with respect to b and w.
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)
        # Accumulate the gradients over the mini-batch.
        nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    # Finally, update the weights and biases with the averaged gradients.
    self.weights = [w - (eta / len(mini_batch)) * nw
                    for w, nw in zip(self.weights, nabla_w)]
    self.biases = [b - (eta / len(mini_batch)) * nb
                   for b, nb in zip(self.biases, nabla_b)]

The update_mini_batch function updates the weights and biases of the neural network using the single mini-batch of data you pass in.
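The averaged-gradient rule at the end of update_mini_batch (w ← w − (η/m)·Σ∇w, and likewise for b) can be checked on a toy problem. The sketch below (my own illustration, not the Network class) applies the same rule to a one-parameter linear model under squared loss, where the gradients are known in closed form:

```python
import numpy as np

def sgd_step(w, b, mini_batch, eta):
    """One mini-batch update for the model y_hat = w*x + b under
    squared loss, using the same averaged-gradient rule as above."""
    nabla_w, nabla_b = 0.0, 0.0
    for x, y in mini_batch:
        err = (w * x + b) - y   # derivative of 0.5*err**2 w.r.t. the output
        nabla_w += err * x      # accumulate dL/dw over the batch
        nabla_b += err          # accumulate dL/db over the batch
    m = len(mini_batch)
    return w - (eta / m) * nabla_w, b - (eta / m) * nabla_b

# Data generated from y = 2x + 1; repeated steps should approach (2, 1).
rng = np.random.default_rng(0)
batch = [(x, 2 * x + 1) for x in rng.uniform(-1, 1, size=32)]
w, b = 0.0, 0.0
for _ in range(500):
    w, b = sgd_step(w, b, batch, eta=0.5)
print(round(w, 2), round(b, 2))  # → 2.0 1.0
```

Dividing the accumulated gradient by len(mini_batch) makes the step size independent of the batch size, which is why the same eta works across different mini_batch_size settings.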
