Python machine learning: logistic regression with the stochastic gradient descent method

2021-12-13 08:19:47
OfStack


Preface

The "stochastic" in stochastic gradient descent refers to the gradient itself. Starting from the current point, the gradient used to find the next point is computed from a randomly chosen sample. Full-batch gradient descent, by contrast, moves from one point to the next using a gradient that is computed over all the data points.

Full-batch gradient descent is stable, but slow.

SGD is fast, but not as stable.
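
To make the trade-off concrete, here is a minimal sketch comparing the two gradients for a straight-line model. The synthetic data, the helper names `full_batch_grad` and `single_sample_grad`, and the seed are all assumptions made for illustration; they are not part of the article's data set or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y is roughly 3 + 2x plus noise.
x = rng.uniform(0.0, 1.0, size=1000)
y = 3.0 + 2.0 * x + rng.normal(0.0, 0.1, size=1000)

beta = np.array([1.0, 1.0])  # current point: (intercept, slope)


def full_batch_grad(beta, x, y):
    """Gradient of the mean squared error over all data points."""
    resid = beta[0] + beta[1] * x - y
    return np.array([2.0 * np.mean(resid), 2.0 * np.mean(x * resid)])


def single_sample_grad(beta, x, y, r):
    """Gradient of the squared error at the single data point r."""
    resid = beta[0] + beta[1] * x[r] - y[r]
    return np.array([2.0 * resid, 2.0 * x[r] * resid])


g_full = full_batch_grad(beta, x, y)

# One random draw: cheap to compute, but noisy.
r = rng.integers(0, len(x))
g_one = single_sample_grad(beta, x, y, r)

# Averaging the single-sample gradient over every index recovers the
# full-batch gradient, which is why SGD points the right way on average.
g_avg = np.mean(
    [single_sample_grad(beta, x, y, i) for i in range(len(x))], axis=0
)

print("full batch     :", g_full)
print("one sample     :", g_one)
print("avg of singles :", g_avg)
```

The single-sample gradient touches one row instead of all of them, which is where the speed comes from; the scatter of `g_one` around `g_full` from draw to draw is where the instability comes from.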

Stochastic gradient descent method

Stochastic gradient descent (SGD) is an algorithm that improves on the computational efficiency of full-batch gradient descent. In essence, we expect the results of SGD to be close to those of full-batch gradient descent; the advantage of SGD is that each gradient is computed faster.
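
The claim that the two methods should land close to each other can be written down for the straight-line model used in this article's code. The following is a sketch under the assumption that the loss being minimised is the mean squared error (the symbols $L$, $g_r$ and $n$ are notation introduced here, not taken from the original):

```latex
% Loss over all n data points (full batch)
L(\beta) = \frac{1}{n}\sum_{i=1}^{n}\bigl(\beta_0 + \beta_1 x_i - y_i\bigr)^2

% Full-batch gradient: one pass over every data point
\nabla L(\beta) = \frac{2}{n}\sum_{i=1}^{n}
  \bigl(\beta_0 + \beta_1 x_i - y_i\bigr)
  \begin{pmatrix} 1 \\ x_i \end{pmatrix}

% SGD gradient: one index r drawn uniformly from \{1,\dots,n\}
g_r(\beta) = 2\bigl(\beta_0 + \beta_1 x_r - y_r\bigr)
  \begin{pmatrix} 1 \\ x_r \end{pmatrix}

% The random gradient is unbiased for the full-batch gradient
\mathbb{E}_r\bigl[g_r(\beta)\bigr] = \nabla L(\beta)

% Update with learning rate \alpha
\beta \leftarrow \beta - \alpha\, g_r(\beta)
```

Each SGD step costs one data point instead of $n$, and because $g_r$ equals the full-batch gradient in expectation, the iterates drift toward the same minimiser, only with extra noise.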

Code


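'''
Stochastic gradient descent (SGD) is an algorithm that improves on the
computational efficiency of full-batch gradient descent. In essence, we
expect the results of SGD to be close to those of full-batch gradient
descent; the advantage of SGD is that the gradient is computed faster.
'''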
import pandas as pd
import numpy as np
import os
os.getcwd()
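# F:\\pythonProject3\\data\\data\\train.csv
# dataset_path = '..'
# This is an application of stochastic gradient descent (SGD).
# The problem is a regression problem.
# We are given the number of new questions and new answers posted each day
# on a large US Q&A community from October 1, 2010 to November 30, 2016.
# The task is to predict the daily number of new questions and answers
# on that site from December 1, 2016 to May 1, 2017.
train = pd.read_csv('..\\train.csv')
# Load the data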
# train = pd.read_csv('train.csv')
test = pd.read_csv('..\\test.csv')
submit = pd.read_csv('..\\sample_submit.csv')
path1=os.path.abspath('.')
print("path1@@@@@",path1)
path2=os.path.abspath('..')
print("path2@@@@@",path2)
print(train)
# Initial settings
beta = [1,1] # initial point
alpha = 0.2 # learning rate, i.e. the step size
tol_L = 0.1 # threshold, i.e. the precision
# Normalise x; train is the 2-D table of training data
max_x = max(train['id']) # max_x is the largest id, i.e. the total number of ids
x = train['id'] / max_x # divide every id by max_x
y = train['questions'] # assign the questions column of train to y
type(train['id'])
print("train['id']#######\n",train['id'])
print("x (normalised id)###\n\n",x)
print("max_x#######",max_x)
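# Compute the direction (the gradient)
def compute_grad_SGD(beta, x, y):
    '''
    :param beta: the current point (intercept, slope)
    :param x: the independent variable
    :param y: the true values
    :return: gradient array
    '''
    grad = [0, 0]
    r = np.random.randint(0, len(x)) # draw one random index in [0, len(x))
    # Both components must use the same randomly chosen sample r
    grad[0] = 2. * np.mean(beta[0] + beta[1] * x[r] - y[r]) # gradient with respect to beta[0]
    grad[1] = 2. * np.mean(x[r] * (beta[0] + beta[1] * x[r] - y[r])) # gradient with respect to beta[1]
    return np.array(grad)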
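# Compute where the next point is
def update_beta(beta, alpha, grad):
    '''
    :param beta: the current point (the initial point on the first call)
    :param alpha: learning rate, i.e. the step size
    :param grad: gradient
    :return: the updated point
    '''
    new_beta = np.array(beta) - alpha * grad
    return new_beta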
# Define the function that computes the RMSE
# Root mean squared error (RMSE)
def rmse(beta, x, y):
    squared_err = (beta[0] + beta[1] * x - y) ** 2 # beta[0] + beta[1] * x is the prediction, y is the true value
    res = np.sqrt(np.mean(squared_err))
    return res
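# First computation
grad = compute_grad_SGD(beta, x, y) # call the gradient function to compute the gradient
loss = rmse(beta, x, y) # call the loss function to compute the loss
beta = update_beta(beta, alpha, grad) # move to the next point
loss_new = rmse(beta, x, y) # compute the loss at the new point
# Start iterating
i = 1
while np.abs(loss_new - loss) > tol_L:
    grad = compute_grad_SGD(beta, x, y) # fresh random gradient at the current point
    beta = update_beta(beta, alpha, grad)
    if i % 100 == 0: # refresh the RMSE only every 100 rounds
        loss = loss_new
        loss_new = rmse(beta, x, y)
        print('Round %s Diff RMSE %s'%(i, abs(loss_new - loss)))
    i += 1
# beta[1] applies to the normalised x; divide it by max_x to compare with a fit on the raw id
print('Coef: %s \nIntercept %s'%(beta[1], beta[0]))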
res = rmse(beta, x, y)
print('Our RMSE: %s'%res)
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(train[['id']], train[['questions']])
print('Sklearn Coef: %s'%lr.coef_[0][0])
print('Sklearn Intercept: %s'%lr.intercept_[0])
# RMSE of the sklearn fit on the raw id, using the fitted intercept and coefficient
res = rmse([lr.intercept_[0], lr.coef_[0][0]], train['id'], y)
print('Sklearn RMSE: %s'%res)
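
The listing above depends on `train.csv`, which is not included here. As a rough, self-contained check of the same idea, the sketch below runs single-sample SGD on synthetic data and compares it with a closed-form least-squares fit. The synthetic data, the fixed step budget `n_steps`, the decaying learning rate, and the use of `np.polyfit` in place of sklearn are all assumptions made for illustration; none of the numbers come from the article's data set.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for train.csv (an assumption): ids 1..n and a noisy
# linear trend in the daily question count.
n = 2000
ids = np.arange(1, n + 1)
questions = 900.0 + 2.0 * ids + rng.normal(0.0, 50.0, size=n)

max_x = ids.max()
x = ids / max_x  # normalise x, as in the listing above
y = questions


def rmse(beta, x, y):
    """Root mean squared error of the line beta[0] + beta[1] * x."""
    return np.sqrt(np.mean((beta[0] + beta[1] * x - y) ** 2))


beta = np.array([1.0, 1.0])  # initial point
alpha0 = 0.2                 # initial learning rate, as in the listing above
n_steps = 20000              # fixed budget (assumption) instead of an RMSE tolerance

for t in range(n_steps):
    r = rng.integers(0, n)                        # pick one random sample
    resid = beta[0] + beta[1] * x[r] - y[r]
    grad = np.array([2.0 * resid, 2.0 * x[r] * resid])
    alpha = alpha0 / (1.0 + t / 1000.0)           # shrink the step over time (assumption)
    beta = beta - alpha * grad

# Closed-form least squares on the raw ids, for comparison.
slope_ls, intercept_ls = np.polyfit(ids, questions, 1)

sgd_coef = beta[1] / max_x   # undo the normalisation of x
sgd_rmse = rmse(beta, x, y)
ls_rmse = rmse([intercept_ls, slope_ls], ids, y)

print('SGD  coef %s intercept %s RMSE %s' % (sgd_coef, beta[0], sgd_rmse))
print('LS   coef %s intercept %s RMSE %s' % (slope_ls, intercept_ls, ls_rmse))
```

Shrinking the step size is one common way to damp the noise that a constant learning rate leaves in the final iterates; with a constant `alpha`, the estimates keep jittering around the least-squares solution instead of settling on it.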
