A Simple Logistic Regression Example Using Python's sklearn Library
- 2021-07-06 11:25:05
- OfStack
Introduction to Sklearn
Scikit-learn (sklearn) is a third-party module commonly used in machine learning. It wraps the most widely used machine learning methods, including regression, dimensionality reduction, classification, and clustering. When faced with a machine learning problem, we can choose an appropriate method with the help of the following figure.
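As a rough illustration of the four method families named above, the sketch below fits one estimator from each on made-up random data. The data, the variable names, and the choice of estimators (LinearRegression, LogisticRegression, PCA, KMeans) are assumptions made for illustration only; they are not taken from this article's example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, LogisticRegression

# Made-up data, used only to show how the estimators are called
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))             # 60 samples, 3 features
y_cont = X @ np.array([1.0, -2.0, 0.5])  # continuous target for regression
y_cls = (X[:, 0] > 0).astype(int)        # binary labels for classification

reg = LinearRegression().fit(X, y_cont)      # regression
clf = LogisticRegression().fit(X, y_cls)     # classification
X_2d = PCA(n_components=2).fit_transform(X)  # dimensionality reduction
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)  # clustering

print(reg.predict(X[:3]))
print(clf.predict(X[:3]))
print(X_2d.shape, labels[:10])
```

All four follow the same pattern: construct the estimator, then call fit (or fit_transform / fit_predict) on an array with one row per sample.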
Sklearn has the following characteristics:
- Simple and efficient tools for data mining and data analysis
- Accessible to everyone and reusable in a variety of contexts
- Built on NumPy, SciPy, and Matplotlib

The code for the logistic regression example looks like this:
import xlrd
import matplotlib.pyplot as plt
import numpy as np
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
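# Note: xlrd 2.0 and later can only read .xls files, so opening this .xlsx workbook requires xlrd < 2.0
data = xlrd.open_workbook('gua.xlsx')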
sheet = data.sheet_by_index(0)
Density = sheet.col_values(6)
Sugar = sheet.col_values(7)
Res = sheet.col_values(8)
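# Build the feature matrix from the raw columns.
# column_stack places the two columns side by side, so X has shape (17, 2):
# one row per sample, with density in column 0 and sugar content in column 1
X = np.column_stack((Density, Sugar))
# y has shape (17,)
y = np.array(Res)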
# Plot the labelled data
f1 = plt.figure(1)
plt.title('watermelon_3a')
plt.xlabel('density')
plt.ylabel('ratio_sugar')
# Draw a scatter plot (x axis: density, y axis: sugar content)
plt.scatter(X[y == 0,0], X[y == 0,1], marker = 'o', color = 'k', s=100, label = 'bad')
plt.scatter(X[y == 1,0], X[y == 1,1], marker = 'o', color = 'g', s=100, label = 'good')
plt.legend(loc = 'upper right')
plt.show()
# Use half of the original data for training and the other half for testing
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.5, random_state=0)
# Create the logistic regression model
log_model = LogisticRegression()
# Train the model on the training set
log_model.fit(X_train, y_train)
# Predict the labels of the test set
y_pred = log_model.predict(X_test)
# Print the confusion matrix and the classification report
print(metrics.confusion_matrix(y_test, y_pred))
print(metrics.classification_report(y_test, y_pred))
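The script above depends on the local file gua.xlsx. As a minimal sketch of the same workflow that runs without that file, the following version replaces the spreadsheet with synthetic data. The generated values, the class means, and the names density, sugar and n_per_class are assumptions made for illustration; they are not the watermelon data set. The sketch also prints the predicted probabilities' companion values (accuracy and fitted coefficients), which the script above does not do.

```python
import numpy as np
from sklearn import metrics, model_selection
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the two feature columns (NOT the watermelon data)
rng = np.random.default_rng(0)
n_per_class = 50
density = np.concatenate([rng.normal(0.45, 0.08, n_per_class),   # class 0
                          rng.normal(0.65, 0.08, n_per_class)])  # class 1
sugar = np.concatenate([rng.normal(0.15, 0.05, n_per_class),     # class 0
                        rng.normal(0.35, 0.05, n_per_class)])    # class 1
y = np.concatenate([np.zeros(n_per_class, dtype=int),
                    np.ones(n_per_class, dtype=int)])

# One row per sample, one column per feature: shape (100, 2)
X = np.column_stack((density, sugar))

# Half for training, half for testing; stratify keeps both classes in each half
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y)

log_model = LogisticRegression()
log_model.fit(X_train, y_train)

y_pred = log_model.predict(X_test)
y_proba = log_model.predict_proba(X_test)  # columns follow log_model.classes_

cm = metrics.confusion_matrix(y_test, y_pred)
print(cm)
print("accuracy:", metrics.accuracy_score(y_test, y_pred))
print("coefficients:", log_model.coef_, "intercept:", log_model.intercept_)
```

The rows of the confusion matrix correspond to the true classes and the columns to the predicted classes, so the diagonal counts the correctly classified test samples.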
Summary