Simple Logistic Regression Example Code Using Python's sklearn Library

  • 2021-07-06 11:25:05
  • OfStack

Introduction to Sklearn

Scikit-learn (sklearn) is a third-party module commonly used in machine learning. It wraps the most common machine learning methods, including regression, dimensionality reduction, classification, and clustering. When we face a machine learning problem, we can choose the corresponding method according to the following figure.

Sklearn has the following characteristics:

  • Simple and efficient tools for data mining and data analysis
  • Accessible to everyone and reusable in a variety of contexts
  • Built on NumPy, SciPy, and Matplotlib
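
The method families named in the introduction are exposed through a common estimator pattern: create an object, call fit, then call predict (or fit_transform for dimensionality reduction). The following is a minimal sketch of that pattern on made-up random data. The variable names and the choice of LinearRegression, KMeans, and PCA are illustrative assumptions and do not come from the original article.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, LogisticRegression

# Synthetic data, made up for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y_continuous = X @ np.array([1.0, -2.0, 0.5])   # target for regression
y_binary = (X[:, 0] + X[:, 1] > 0).astype(int)  # target for classification

# Regression: fit on (X, y), then predict continuous values
reg = LinearRegression().fit(X, y_continuous)
reg_pred = reg.predict(X)

# Classification: same calls, but the predictions are class labels
clf = LogisticRegression().fit(X, y_binary)
clf_pred = clf.predict(X)

# Clustering: no labels are passed to fit
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_labels = km.predict(X)

# Dimensionality reduction: fit_transform returns the projected data
X_2d = PCA(n_components=2).fit_transform(X)

print(reg_pred.shape, clf_pred.shape, cluster_labels.shape, X_2d.shape)
```

Only the estimator class changes between tasks, which is why swapping one model for another in a script usually touches a single line.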

The code for the logistic regression example looks like this:


import xlrd
import matplotlib.pyplot as plt
import numpy as np
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
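# Read the original data from the first sheet of the workbook.
# Note: xlrd 2.0 and later only read .xls files, so opening an .xlsx
# workbook this way requires xlrd < 2.0.
data = xlrd.open_workbook('gua.xlsx')
sheet = data.sheet_by_index(0)
Density = sheet.col_values(6)
Sugar = sheet.col_values(7)
Res = sheet.col_values(8)
# np.array([Density, Sugar]) has shape (2, 17); transpose it so that each
# row is one sample (density, sugar). reshape(17, 2) would scramble the pairs.
X = np.array([Density, Sugar]).T
# y has shape (17,)
y = np.array(Res)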
# Plot the labeled data
f1 = plt.figure(1)
plt.title('watermelon_3a')
plt.xlabel('density')
plt.ylabel('ratio_sugar')
# Draw a scatter plot (x axis is density, y axis is sugar content)
plt.scatter(X[y == 0, 0], X[y == 0, 1], marker='o', color='k', s=100, label='bad')
plt.scatter(X[y == 1, 0], X[y == 1, 1], marker='o', color='g', s=100, label='good')
plt.legend(loc='upper right')
plt.show()
# Use half of the original data for training and the other half for testing
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.5, random_state=0)
# Create the logistic regression model
log_model = LogisticRegression()
# Train the logistic regression model
log_model.fit(X_train, y_train)
# Predict the labels (y) for the test set
y_pred = log_model.predict(X_test)
# View the test results
print(metrics.confusion_matrix(y_test, y_pred))
print(metrics.classification_report(y_test, y_pred))
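
The workbook gua.xlsx is not included with the article, so the listing cannot be run as-is. The following is a rough sketch of the same steps on synthetic stand-in values. The numbers, the labeling rule, n_samples, and the stratify argument are all assumptions made for illustration. The sketch also shows why stacking two column lists with np.array needs a transpose rather than a reshape to produce one row per sample.

```python
import numpy as np
from sklearn import metrics, model_selection
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the two feature columns and the label column;
# these values are made up and are NOT the contents of gua.xlsx.
rng = np.random.default_rng(0)
n_samples = 40
density = rng.uniform(0.2, 0.8, size=n_samples).tolist()
sugar = rng.uniform(0.0, 0.5, size=n_samples).tolist()
labels = [1 if d + s > 0.75 else 0 for d, s in zip(density, sugar)]

stacked = np.array([density, sugar])         # shape (2, n_samples): one row per feature
X = stacked.T                                # shape (n_samples, 2): one row per sample
X_reshaped = stacked.reshape(n_samples, 2)   # same shape, but the values are re-chunked

# Row 0 of X pairs the first density with the first sugar value;
# row 0 of X_reshaped holds the first two density values instead.
print(X[0], X_reshaped[0])

y = np.array(labels)
# stratify=y is an addition here; it keeps both classes in each half
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)
log_model = LogisticRegression().fit(X_train, y_train)
y_pred = log_model.predict(X_test)

# Rows of the confusion matrix are true classes, columns are predicted classes
cm = metrics.confusion_matrix(y_test, y_pred)
print(cm)

# Per-class probabilities behind the predicted labels; each row sums to 1
proba = log_model.predict_proba(X_test)
print(proba[:3])
```

With a real workbook, only the first block would change: the three lists would come from the sheet columns instead of the random generator.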
